Apr 12

What Happened When We Created a Facebook App for Social Network Analysis

Facebook and Mark Zuckerberg are getting blamed for a large number of issues, from promoting fake news and election fraud to mishandling user data and profiting from selling user data.

While some of that may be true, the Facebook security breach is actually a violation of Facebook API licensing rules by the people who used it. Facebook provided the data and encouraged developers like us to create innovative solutions for the Facebook ecosystem. They weren’t selling the data. They weren’t even charging us to use it.

Our Facebook App with Social Network Analysis and Maps

In 2010, we created a Facebook application using our Sentinel Visualizer technology to perform Social Network Analysis (SNA) based on a user’s friends’ friends. It would automatically cluster friends so you could quickly see their groups (high school, college, work, family, in-laws, clubs, etc.).
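The post doesn’t describe how Sentinel Visualizer actually clustered friends. As a rough illustration only, one simple way to get groups like these is to treat each friend as a node, link friends who know each other, and collect the connected components. The function name `cluster_friends` and the sample names in the usage note are hypothetical.

```python
from collections import defaultdict


def cluster_friends(friends, friendships):
    """Group friends into clusters of people linked to one another.

    friends: iterable of friend names (the user is not included).
    friendships: iterable of (a, b) pairs meaning a and b know each other.
    This is plain connected components, a stand-in and not the app's algorithm.
    """
    neighbors = defaultdict(set)
    for a, b in friendships:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    clusters = []
    for person in friends:
        if person in seen:
            continue
        # Walk outward from this person to collect everyone reachable.
        stack, group = [person], set()
        seen.add(person)
        while stack:
            current = stack.pop()
            group.add(current)
            for other in neighbors[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(group))
    return clusters
```

Called with friends Ann, Bob, Cat, Dan and Eve, where Ann knows Bob, Bob knows Cat, and Dan knows Eve, this returns two groups: Ann, Bob and Cat in one, Dan and Eve in the other. A production SNA tool would likely use a finer-grained community detection method, since one mutual acquaintance is enough to merge two components.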

Facebook Social Network Analysis App of Clustered Friends

Each box (picture) was one of your friends, and you could move them around the network, hover over them to get their info, or click on them to go to their page.

We also plotted friends on a Microsoft Bing Map, making it easy to see who was near you or near a place you were visiting.

Plotting Your Friends' Locations on a Map

 

We launched our free Sentinel Visualizer Facebook App to a limited number of users and it started to gain followers. People were amazed to see which of their friends knew each other. The application started to go viral. We were having trouble supporting the traffic.

Not Allowed to Save Facebook Data

One of the things developers couldn’t do was to save Facebook’s data. All we collected were the user names and email addresses people provided when they registered for our program. Unfortunately, other developers didn’t abide by Facebook’s terms and the data improperly got to Cambridge Analytica and others.

Facebook Stopped Making the Data Available

Our app ceased to work when Facebook limited their APIs and removed our ability to retrieve the list of your friends’ friends within your network.

It’s not entirely Facebook’s fault for trying to spur innovation by sharing their data for free. Some developers violated the trust Facebook gave them.

The Full Story

Here’s our new web page describing our experience in detail:

Jan 05

Sean Hannity Radio Show Interview on Julian Assange, WikiLeaks, Russian Hacking, and Cyber Warfare

Background

The day after an amazing personal interview of Julian Assange by Sean Hannity aired on his TV show, FMS President Luke Chung was invited to discuss the related technology on his radio show.

Hannity traveled to London to interview Julian Assange at the Ecuadorian Embassy where he’s seeking asylum. They discussed an overview of Assange’s role as founder of WikiLeaks, and its obtaining and publishing the emails from the Democratic National Committee in the weeks before the US Presidential election. Some people attribute Hillary Clinton’s loss to the revelations in those emails, especially from John Podesta, the former White House Chief of Staff and Chairman of the Clinton campaign. They are also accusing the Russians of hacking (stealing) the data and providing it to Assange so Donald Trump could win the election.

Radio Show

On January 4, 2017, I was on the radio show with Sean Hannity and Brigadier General Eli Ben Meir, former Israeli Military Intelligence chief. The three of us discussed the WikiLeaks disclosures. I commented specifically on:

  • Cyber attacks, and how the security breach at OPM disclosed non-classified government employees and, by omission, who was covert at American embassies globally.
  • Julian Assange’s careful word choices, which exclude Russia as his source without excluding it as the ultimate source of his sources.
  • The need for WikiLeaks to keep its sources confidential and how it amplified the data from Bradley Manning and Edward Snowden.
  • Different approaches to preventing cyber attacks depending on the cause.
    “It’s one thing when someone steals your car because they broke into it. It’s another thing when someone steals your car because you left your keys in the ignition.”

Here’s the audio of the show:

My segment starts at the 6:50 mark. Sean and General Meir speak first, then I start around 9:25. Final comments at 14:15 and it wraps up by 14:50.

Additional Issues

Only a limited amount of information can be discussed in such a short interview. Some additional issues to consider are:

Data Security

Securing data over the internet and inside organizations is very challenging. Threats may come from:

  • External hacks that need to be monitored and defeated
  • Internal people who unintentionally leave the front door unlocked
  • Internal people who intentionally leak information

Different solutions are required for each type of threat. Some are at the software vendor, design, and developer level, while others involve end-user training, background checks, and monitoring.

Applications can be built so that simply disclosing a user name and password doesn’t compromise the whole system, by requiring two-factor authentication and registering the devices that can use those credentials.
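To make that concrete, here is a minimal sketch of a login check that refuses a correct password unless it comes from a registered device and is accompanied by a valid one-time code. The code generator follows the standard time-based one-time password scheme (HMAC-SHA1, 30-second steps). The account layout, the `login_allowed` function and the device IDs are hypothetical, and the password check itself is assumed to happen elsewhere.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time code (HMAC-SHA1 variant of RFC 6238)."""
    counter = struct.pack(">Q", unix_time // step)
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def login_allowed(account: dict, password_ok: bool, device_id: str,
                  code: str, unix_time: int) -> bool:
    """A leaked password alone is not enough: device and code must match too."""
    if not password_ok:
        return False
    if device_id not in account["registered_devices"]:
        return False
    expected = totp(account["totp_secret"], unix_time)
    return hmac.compare_digest(expected, code)
```

With this shape, someone who learns the user name and password still fails the check from an unregistered device or without the current code. A real system would also accept codes from adjacent time steps to allow for clock drift, and would rate-limit attempts.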

Unfortunately, many systems were built well before today’s cyber threats existed. The cost of making those systems more secure without breaking their existing functionality will be daunting and expensive. In many cases, the original source code, development environment and/or vendor are long gone, so the only option is to replace them which is also very expensive and time consuming.

Julian Assange and WikiLeaks are Not Heroes

We need to keep in mind that WikiLeaks exposed top secret US information by publishing the disclosures from Manning and Snowden. Lives were put at risk and lives may have been lost because of those publications. The Arab Spring was inflamed in part by the disclosure of diplomatic communications and one could argue the human tragedy in Syria is tied to this as well. While Republicans are celebrating and defending Assange and WikiLeaks now for the DNC emails, the tables may turn very quickly.

Data That’s Not Exposed May be More Dangerous

While many are focused on the DNC emails, it’s not unreasonable to assume the people who hacked that also got the RNC emails. Data can be power, and in the wrong hands, data can be used for nefarious purposes such as blackmail.

If the RNC data were compromised, we should be extremely worried if the hackers discovered it was more valuable to keep private than public. Whether they use it directly or sell it to another party or country, the information can make victims puppets by threatening the exposure of their personal data. It’s not uncommon during E-discovery of an email server to discover all sorts of inappropriate language, behavior and activity conducted by individuals in an organization. Disclosures of affairs, homosexual activity, underage sex, bribery, unethical business dealings, breaches of confidentiality, collusion, and actual crimes are often found in email threads and can be used for blackmail.


Sep 23

Designing a Data Entry System Properly; Overhauling the Healthcare.gov Web Site

Since my original impression that the debut of the Healthcare.gov web site was a technological disaster, I’ve contended that the website could have been created for much less, and been much easier to use, than the mess that was delivered.

There finally seems to be progress in this direction according to today’s New York Times article, HealthCare.gov Is Given an Overhaul. I was quoted by Robert Pear:

“Instead of being user-friendly, the original website was user-hostile”

Basics of Data Entry Systems

We at FMS have created countless database systems where data entry played an important role. Unlike fancy, graphics-filled systems that look nice, data entry systems must be designed with a focus on ease of use, so the end user can enter, review, and update their information. If there are many questions and complex relationships, users need to be able to see as much of that on one screen as possible. If multiple screens are required, being able to move back and forth between screens without losing data, and having changes in one screen reflected on others, is critical for an efficient and intuitive user experience.

Data Entry Systems Should Target Users with Large Screens

For complex tasks such as writing a paper or working on a large spreadsheet, computers remain the preferred platform for getting work done where people can have one or multiple large screens. Serious data entry applications should target that user.

Mobile Apps Have a Role, but Not for Serious Data Entry

While mobile applications have a place, they’re not appropriate for complicated data entry since one question per screen is very inefficient. Not being able to see previous entries and pressing Next and Back for each question drives users crazy. The original designers of the Healthcare.gov web site designed it as if it were a simple, consumer mobile app meant to be filled out with a few finger clicks. They were either paid by the screen or just clueless about what a business data entry system requires.

Initial Request for Information Should be Anonymous

The public-facing Healthcare.gov website should focus on helping prospects with the buying process. People need to quickly browse the health insurance options that are available to them in their state, along with cost estimates. The initial data entry should be the minimal anonymous information necessary to produce those results, such as gender, age, zip code, family size, etc. Nothing personal, such as names, social security numbers, or email addresses, should be requested.

Automating a Paper Form

Only after customers have made a decision to buy should they be required (and expect) to provide more detailed information. This application feature is the core of the public-facing Healthcare.gov website and is simply the automation of a 12-page paper form. It shouldn’t be that difficult.

We at FMS have automated paper forms for decades. Recently, we did this for a series of paper documents at the National Archives. The cost of doing this was in the tens of thousands of dollars, not the hundreds of millions that Healthcare.gov cost.

Separating Data Entry from Complex Validation

A high-volume data entry system like Healthcare.gov should be designed to collect the user’s information as quickly as possible without trying to validate everything with other government systems in real time. The cross-validation of information against IRS, HHS, Homeland Security, and other databases should happen in a background process that can withstand slowdowns or down times of dependent systems. This separates the complexity and risk of linking multiple systems together, manages the load on the other systems, and lets the user get done quicker. If a problem is detected later, an email can be sent to the user to fix the mistake or invalidate their application. Regardless, none of that needs to happen while the user is entering their data. After all, it’s not as if they were going to get insurance immediately upon pressing Submit.
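A minimal sketch of that separation follows. Submitting an application only saves it and queues it. A background job runs the cross-checks later and simply retries when a dependent system is unavailable. Everything named here (`submit_application`, `run_validation_batch`, the `validators` list standing in for the IRS, HHS and other lookups) is hypothetical, and an in-memory queue stands in for a durable one.

```python
from collections import deque

pending = deque()  # applications waiting for background validation


def submit_application(application: dict) -> str:
    """Save the user's entries immediately; no external systems are called."""
    application["status"] = "received"
    pending.append(application)
    return "Thanks! We will email you once your information is verified."


def run_validation_batch(validators) -> list:
    """Background job: check queued applications, tolerating outages.

    Each validator takes an application and returns a list of problem
    messages. A validator raises ConnectionError if its system is down.
    """
    notices = []
    for _ in range(len(pending)):
        application = pending.popleft()
        try:
            problems = [msg for check in validators for msg in check(application)]
        except ConnectionError:
            pending.append(application)  # dependent system is down; retry later
            continue
        application["status"] = "needs_correction" if problems else "verified"
        notices.append((application["email"], application["status"], problems))
    return notices
```

The returned notices are what would drive the follow-up emails. The user’s wait depends only on how quickly the site can save a record, not on the slowest government database behind it.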

Taxpayer Abuse

It remains shocking to me that it cost taxpayers hundreds of millions of dollars initially for the broken Healthcare.gov site, and hundreds of millions of dollars afterwards to the same contractors to fix it. The procurement process and incentives are completely inverted for creating and delivering quality software. It’s outright theft, but no one seems to be held responsible for it, and lots of people are profiting mightily from it.

Conclusion: Data Entry Systems Aren’t Difficult If You Know What You’re Doing

I’ve contended that we at FMS could have created the public-facing Healthcare.gov site for $1 million. Some people scoff at that, but in our world and that of our customers, $1 million still goes a long way. We created an international humanitarian relief logistics system for the United Nations for half that amount, and it supports full language localization as it’s deployed in 80+ countries. Healthcare.gov didn’t even support Spanish when it debuted, and that was one of its original requirements.

Creating a good data entry system is not rocket science. This is not something that needs to be done in Silicon Valley. What’s needed is a team who’ve done it before and know what they’re doing. Creating this type of solution requires a solid database foundation, understanding the user needs, creating an intuitive user experience, and building it so that it’s maintainable over time. It’s not something that can be created by people on their first paid programming job, but it’s not a rare skill. I’m proud that my development team at FMS have been with me for decades and continue to deliver systems that just work.

Jan 27

Helping Create Living Wage Jobs with YearUp Featured on CBS 60 Minutes

Over the past few years, I’ve had the pleasure of working with and supporting the Arlington, Virginia chapter of YearUp. YearUp is a non-profit organization helping at-risk youths get out of a lifetime in minimum wage jobs and toward a career path with a living wage.

They not only teach marketable skills, but supplement them with the personal and business soft skills necessary to be successful in business. They have a particular focus on teaching computer hardware skills, help desk, and basic finance. They understand and address employer needs: “We know you hire for skills, and fire for behavior in the work world.” By learning what companies and bosses expect, these youths are able to better understand what it means to be a professional, provide more value to their employers, and justify earning a higher salary.

Both FMS EVP Michelle Swann-Renee and I have met the students in person to discuss what employers seek and how to differentiate oneself positively in the workforce. As employers, we need people who arrive with skills we can’t train: honesty, work ethic, personal drive, high standards and expectations of one’s performance, getting along with others, ability to accept constructive criticism, writing and speaking skills, common sense, etc. Specific technical skills can be taught and change over time; those basic skills and character traits are difficult for a company to train. We’ve been impressed with the dedication of the staff and eagerness of the students to take the opportunity to learn and succeed. Those who make it through the program are very likely to be successful in a career and further education.

Last night, YearUp was featured on CBS 60 Minutes in a segment by Morley Safer: Jobs program aids Fortune 500 and underprivileged youth

Hope you get a chance to check out and support this program.

Dec 15

Who Thinks the Relaunched Healthcare.gov Performance Metrics are Acceptable?

Technical Evaluation of the Relaunched Healthcare.gov Web Site

On December 1, an updated version of Healthcare.gov was deployed which offers considerable improvements over the original October 1 launch. The administration and contractors issued some press releases, and the general public and press just accepted them without really understanding the technical issues. Here’s my technical assessment of the published statements.


My Assessment of Healthcare.gov, Version 1, on October 1

As I described in my original blog post about Healthcare.gov, the site on October 1 was a technical disaster. I received a lot of criticism for my original assessment from those who thought I had a political agenda against ACA or people who simply wished the site was functional independent of the facts.

My assessment on October 1 was eventually vindicated. It took a few weeks for the media, general public, and administration to recognize that the issues were far more problematic than the politically attractive excuse of having too many users.

Will the Contractors Ever be Held Accountable?

The contractors who built the system didn’t seem to know what they were doing and didn’t prioritize the need to build a functional system. I wrote a blog post summarizing how these large IT government contractors often abuse taxpayers: Too Big to Fire: How Government Contractors on Healthcare.gov Maximize Profits

Unfortunately, the contractors who delivered the flawed system on October 1 were rewarded with additional contracts and funds to fix the mess they created. Our federal government procurement process actually gives them more money from their failure than if they did a good job. It’s no wonder that large IT government contractors continue to deliver technically mediocre results. As long as they make sure their lawyers are more powerful than the government lawyers, they can deflect ALL blame so they can continue to use their “unblemished” past performance to go after new contracts. We will see if any of the contractors here are held accountable for this fiasco when they seek future business. This contractor behavior extends across all branches of government.

It would be amazing to interview the developers who actually worked on the original project, discover what their prior experience was, what they were being paid, and how much the taxpayers were billed for their “expertise”. The contractors are enforcing confidentiality rules to prevent those people from talking to the press in the “interest” of protecting taxpayers. I think it’s pretty clear which interests they’re trying to protect.


Enhancements in Version 2

When the administration recognized the technical disaster, they brought in Jeff Zients to lead the disaster relief team. It’s a small world. Mr. Zients and I actually worked at the same firm (SPA/Mercer) before I started FMS, though I left a few years before he joined. Through his leadership, he added some experienced people and reorganized the team while using the same contractors. They issued a Progress and Performance Report which summarized their work:

  • System Stability: Uptime consistently above 90%
  • Reduced Error Rates: per page system time outs or failures from 6+% to 0.75%
  • System Capacity: 50,000 simultaneous users, 20-30 minutes per user for 800K per day
  • Software Fixes: 400+ Bugs Eliminated
  • Hardware Upgrades
  • Real-time Monitoring: Dedicated team focused on site monitoring and instant incident response
  • Improved Response Times: from 8 seconds to under 1 second
  • There was also Improved Window Shopping for users.

To a layman, these results seem adequate. To anyone familiar with commercial software development, they are far below what we or any of our clients would consider acceptable. This is not what professional software developers should deliver, nor what taxpayers should accept.

Review of Relaunch Accomplishments

I’m quite surprised others haven’t provided a technical review of the December 1 relaunch:

System Metrics: 90% up time (One Nine Availability is Awful)
Why do people think 90% availability is acceptable? Even their data showing 95% is awful for a web site. That’s not equivalent to an A in class.

90% up time means it’s down 10% of the time, or 2.4 hours per day. 95% is still down more than an hour (1.2 hours) per day. Most web sites have hosting uptime based on the number of 9’s. For instance, 3 nines means 99.9% up time. There are 8760 hours per year (365 days x 24 hours per day). A 99.9% availability means it’s down almost 9 hours a year. 99.99% availability is less than one hour down per year. High volume commercial web sites strive for 5 nines, or less than 10 minutes of down time per year.
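The arithmetic behind the nines is easy to check. A short sketch (the function name `downtime_hours` is just for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours


def downtime_hours(availability_percent: float, period_hours: float) -> float:
    """Hours of downtime in a period at a given availability percentage."""
    return period_hours * (100 - availability_percent) / 100


for pct in (90, 95, 99.9, 99.99, 99.999):
    per_day_minutes = downtime_hours(pct, 24) * 60
    per_year_hours = downtime_hours(pct, HOURS_PER_YEAR)
    print(f"{pct}% up: {per_day_minutes:.1f} min/day, {per_year_hours:.2f} hours/year")
```

At 90% the site is down 2.4 hours every day. At 99.9% it is down 8.76 hours a year, at 99.99% about 53 minutes a year, and at 99.999% a little over 5 minutes a year.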

I have never heard of any web site or client expecting or satisfied with one 9 availability.

An Error Rate Below 1% is Still Pretty Bad
99% sounds good for a class exam, but it’s not good for software. How can a production web site have a 0.75% error rate? The rate seems to be based on the number of pages, which is far worse than if it were based on users. If it’s based on users, with the 50,000 capacity, that’s 375 errors. But when it’s based on pages, assuming each user goes through 50 pages, 18,750 of their 2.5 million pages fail. If each failure hits a different user, that means up to 37.5% of users crash (18,750 divided by 50,000).
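Dividing 18,750 failed pages by 50,000 users treats every failure as landing on a different user, so 37.5% is a ceiling. A rough sketch of the same calculation, plus the share of users affected if page failures are assumed to be independent of each other (an assumption, as is the 50 pages per user):

```python
def expected_failed_pages(users: int, pages_per_user: int,
                          page_error_rate: float) -> float:
    """Total page failures expected across all users."""
    return users * pages_per_user * page_error_rate


def share_of_users_hitting_failure(pages_per_user: int,
                                   page_error_rate: float) -> float:
    """Chance a user sees at least one failed page, assuming independence."""
    return 1 - (1 - page_error_rate) ** pages_per_user


failed = expected_failed_pages(50_000, 50, 0.0075)
ceiling = failed / 50_000
independent = share_of_users_hitting_failure(50, 0.0075)
print(f"{failed:,.0f} failed pages, ceiling {ceiling:.1%}, independent {independent:.1%}")
```

Under the independence assumption roughly 31% of users still hit at least one failure during a 50-page session. Either way, a per-page rate that sounds small translates into something like a third of users running into an error.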

Of more concern is the cause of the errors. Software either works or it doesn’t. It doesn’t randomly fail. Is the platform failing 0.75% of the time without knowing why? That would be disturbing and could indicate lots of different bugs. If the contractors don’t know what’s causing the crashes in their buggy code, that raises very serious security implications.

Or do they know that the system will always crash when people perform certain tasks, and expect people to do that only 0.75% of the time? Still not good, but better.

Beyond crashing bugs, the site may run without crashing but fail to perform properly such as the problems submitting accurate data to the insurance companies. Those non-crashing failures aren’t even part of this error rate which is already too high for a production system.

Capacity of only 50,000?
This is a very strange metric. One usually measures website traffic based on number of page views or transactions. More users can be supported by adding more bandwidth and instances of the application on more servers. The capacity issue comes from what people are doing. If they are browsing static pages (not entering data), the number of simultaneous users should be much larger. Even if they are entering data, the capacity to save the data should be much higher than 50,000.
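The published figures (50,000 simultaneous users, 20-30 minutes per user, 800K per day) can at least be related to one another with Little’s law: average users on the site equals arrival rate times time on the site. A rough sketch, assuming a 25-minute midpoint visit and traffic spread evenly over 24 hours, neither of which the report states:

```python
MINUTES_PER_DAY = 24 * 60


def average_concurrent_users(visits_per_day: float,
                             minutes_per_visit: float) -> float:
    """Little's law: average users on site = arrival rate x time on site."""
    arrivals_per_minute = visits_per_day / MINUTES_PER_DAY
    return arrivals_per_minute * minutes_per_visit


def visits_per_day_at_capacity(concurrent_users: float,
                               minutes_per_visit: float) -> float:
    """Daily visits if the site ran at full concurrency around the clock."""
    return concurrent_users * MINUTES_PER_DAY / minutes_per_visit


print(round(average_concurrent_users(800_000, 25)))
print(round(visits_per_day_at_capacity(50_000, 25)))
```

Under those assumptions, 800K visits a day averages out to roughly 13,900 simultaneous users, while 50,000 users sustained all day would be about 2.9 million visits. So the 50,000 figure reads as a peak allowance of about 3.6 times the average, which still says nothing about page views or transactions.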

It’s not clear what is causing the 50,000 bottleneck. It shouldn’t be the front-end web application. That should be designed to efficiently save user inputs. The users aren’t entering a lot of information in the grand scheme of data entry systems.

A well designed application would separate the real-time user experience from the more capacity constrained data lookup requirements that may have bottlenecks caused by slow legacy systems at the IRS, HHS, INS, etc. This simply means that the user would enter their information quickly, the system would process it offline, and an email would notify them when the verifications were complete.

Capacity Limitations are Odd
The Healthcare.gov web site begs for the use of a commercial cloud provider that can automatically support the fluctuating volumes of users. A web site needs to accommodate the largest number of users, not the average. The large volume spike is ahead of us on the deadline date, December 23rd. Volumes would drop considerably after that. By using a commercial cloud provider like Microsoft Azure or Amazon EC2, there would be no need to buy hardware that accommodates huge spikes in users but sits unnecessary after peak times.

We suspect it’s more profitable for the contractors to buy the extra hardware and configure it poorly than to use commercial cloud providers who would provide a better service at lower cost, and with lower contractor profits. The contractors may have also implemented features that, for “security reasons”, prevent the use of a commercial cloud provider. That could have justified the creation of their own private system even though it probably decreased security given the crashes they’ve experienced.

Software Fixes and Test Plan
Fixing over 400 bugs is obviously a very good thing, but is that enough? How did so many bugs slip through a Test Plan? And what critical bugs remain that they decided not to fix?

  • What was the test plan before October 1?
  • Were the tests conducted and what bugs were known before October 1?
  • How did they decide to release Healthcare.gov with those bugs?
  • How many bugs were found after October 1 and how were they identified?
  • Is the current Test Plan adequate?
  • What bugs were allowed in the relaunched version?
  • How are known and new bugs being handled?

Software development never reaches perfection but a good test plan covers the expected extremes to ensure the features work, unexpected errors are gracefully trapped, the system is scalable to support the expected number of users, and the site is secure.

In our experience, buggy software inevitably creates and reveals more bugs as bugs are addressed. Known problems with transmitting data to the insurance companies were already acknowledged. This implies this final step of the process was poorly tested, probably because all the preceding steps were failing. This would indicate many unknown issues that still need to be found and fixed.

If the original developers didn’t know what they were doing, trying to fix their work could be a waste of time. An experienced development team may be able to create a better solution in less time than fixing shoddy design and code from unqualified personnel.

Hardware: Do they Have Development, Testing and Staging Platforms?
The only reason I can see for such low availability is the lack of proper development, testing and staging environments. When we create web sites, our software developers need their own hardware to create and test their work without disrupting the production system. Testers need a separate platform to do their work and report back to the developers about the problems they encounter. And a staging site is necessary to review what’s about to be deployed. When the decision is made to release the new version, a switch can be made to make the staging site the new production one. In a modern host, the switch can be done almost instantaneously. Maybe it’s down for a short period to verify the new site is working, but it’s not down for extensive testing because the testing and staging environments already handle that.

Based on the information before the October 1 debut, it was clear that the standard software environment of development, testing, staging and production did not exist. How the managers of the project could have neglected this fundamental part of software development is beyond me, especially for the amount of money spent to build this site.

The absence of the proper platforms indicates the people didn’t even consider how they’d enhance and maintain the system over time, and further supports my contention that the people who created and managed this website had never been paid to build commercial database web sites before.

It really is software malpractice to not have the proper development, testing, staging and production platforms in place. The contractors should be liable for such neglect and reimburse the taxpayers.

Why Wasn’t the Site Redesigned for Simplicity, Performance, Scalability and Security?
There were many opportunities to redesign the site to make it more consumer friendly, reduce the amount of development and testing resources, support more users, and improve security. I list these missed opportunities based on what I have seen:

  • The account creation page should be one screen, not three. We create multiple pages if entries in one screen impact the following screens. For the Healthcare.gov site, that’s not the case. For instance, if on the first page you enter an email address that already exists in the system, you’re not told it’s invalid until you finish the third page and are forced to restart. That just adds load on the system. There’s also no need to create a different user name. Why not just use the email address? Most web sites have a one-page account creation form, but we understand how having more pages is more profitable to the contractor.
  • The See Plans feature is a huge improvement for shopping. However, when someone finds and wants to buy a plan without a subsidy, there isn’t a way to do so without creating an account in the system. The site should simply direct the customer to the insurance company since the government is not involved with providing a subsidy. In addition to improving the customer experience, that would reduce the load on the Healthcare.gov web site so they can serve more customers. Get them off the site as quickly as possible!
  • There’s no need to ask for information that isn’t directly tied to calculating the subsidy. The “nice to have” questions on race can be discarded to improve response time, reduce the time it takes users to fill out the application form, and increase the number of users the site can support.

Conclusions

Over the years, we’ve helped lots of organizations design their software solutions, select technologies, specify architectures, and deliver solutions that are reliable, scalable, secure and maintainable. So much of the Healthcare.gov site seems to remain quite fragile.

I don’t mean to slam the many people who worked hard to salvage the awful work of the initial developers. I’m sure they didn’t get to spend much time with their families over Thanksgiving. The relaunched site is definitely much better than the original version. But it only looks good when compared to that technical disaster. Can anyone claim the new metrics are acceptable for an enterprise quality, nationwide public site as important as this?

For more information, read my earlier blog post Too Big to Fire: How Government Contractors on Healthcare.gov Maximize Profits, and a newer post Designing a Data Entry System Properly; Overhauling the Healthcare.gov Web Site.