Author Archives: kevin

About kevin

I write the posts

Virgin Mobile fails web security 101, leaves six million subscriber accounts wide open

Update: Virgin fixed the issue Tuesday night after taking their login page down for four hours. Please see my update at the bottom of this post.

The first sentence of Virgin Mobile USA’s privacy policy announces that “We [Virgin] are strongly committed to protecting the privacy of our customers and visitors to our websites at www.virginmobileusa.com.” Imagine my surprise to find that pretty much anyone can log into your Virgin Mobile account and wreak havoc, as long as they know your phone number.

I reported the issue to Virgin Mobile USA a month ago and they have not taken any action, nor informed me of any concrete steps to fix the problem, so I am disclosing this issue publicly.

The vulnerability

Virgin Mobile forces you to use your phone number as your username, and a 6-digit number as your password. This means that there are only one million possible passwords you can choose.

Screenshot of Virgin Mobile login screen

This is horribly insecure. Compare a 6-digit number with a randomly generated 8-letter password containing uppercase letters, lowercase letters, and digits – the latter has 218,340,105,584,896 possible combinations. It is trivial to write a program that checks all million possible password combinations, easily determining anyone’s PIN inside of one day. I verified this by writing a script to “brute force” the PIN number of my own account.
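To put numbers on that comparison, here is a minimal sketch of the arithmetic. The variable names are mine, and the one-day window is simply the figure quoted above; it says nothing about how fast Virgin's servers would actually respond.

```python
import string

PIN_DIGITS = 6
pin_space = 10 ** PIN_DIGITS  # every 6-digit PIN, 000000 through 999999

# Uppercase letters, lowercase letters, and digits: 62 symbols
alphabet = string.ascii_uppercase + string.ascii_lowercase + string.digits
password_space = len(alphabet) ** 8  # random 8-character password

SECONDS_PER_DAY = 24 * 60 * 60
# Request rate needed to try every possible PIN within one day
rate_needed = pin_space / SECONDS_PER_DAY

print(f"{pin_space:,} PINs vs {password_space:,} passwords")
print(f"The password space is {password_space // pin_space:,} times larger")
print(f"{rate_needed:.1f} requests/second covers every PIN in a day")
```

At a dozen or so requests per second, the whole PIN space fits in a day, and on average an attacker would need only half of it.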

The scope

Once an attacker has your PIN, they can take the following actions on your behalf:

  • Read your call and SMS logs, to see who’s been calling you and who you’ve been calling

  • Change the handset associated with an account, and start receiving calls/SMS that are meant for you. They don’t even need to know what phone you’re using now. Possible scenarios: $5/minute long distance calls to Bulgaria, texts to or from lovers or rivals, “Mom I lost my wallet on the bus, can you wire me some money?”

  • Purchase a new handset using the credit card you have on file, which may result in $650 or more being charged to your card

  • Change your PIN to lock you out of your account

  • Change the email address associated with your account (which only texts your current phone, instead of sending an email to the old address)

  • Change your mailing address

  • Make your life a living hell

How to protect yourself

There is currently no way to protect yourself from this attack. Changing your PIN doesn’t work, because the new one would be just as guessable as your current PIN. If you are one of the six million Virgin subscribers, you are at the whim of anyone who doesn’t like you. For the moment I suggest vigilance, deleting any credit cards you have stored with Virgin, and considering switching to another carrier.

What Virgin should do to fix the issue

There are a number of steps Virgin could take to resolve the immediate, gaping security issue. Here are a few:

  • Allow people to set more complex passwords, involving letters, digits, and symbols.

  • Freeze your account after 5 failed password attempts, and require you to provide more personal information before unfreezing the account.

  • Require both your PIN and access to your handset to log in. This is known as two-step verification.
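As a rough illustration of the second suggestion, here is a sketch of a lockout counter that lives on the server and is keyed by phone number. Everything here is hypothetical (the `LoginGuard` class, the in-memory dictionaries, storing PINs in plain text); I have no knowledge of how Virgin's backend works, and a real system would hash the PINs and persist the counters.

```python
from collections import defaultdict

MAX_FAILURES = 5  # threshold from the suggestion above


class LoginGuard:
    """Tracks failed logins per account on the server side."""

    def __init__(self, pins):
        self._pins = pins  # hypothetical store: phone number -> PIN
        # The counter is server state, so a client cannot reset it
        # by clearing cookies or changing request headers.
        self._failures = defaultdict(int)

    def attempt(self, phone, pin):
        if self._failures[phone] >= MAX_FAILURES:
            return "locked"  # frozen until the owner is re-verified
        if self._pins.get(phone) == pin:
            self._failures[phone] = 0
            return "ok"
        self._failures[phone] += 1
        return "denied"


guard = LoginGuard({"5551234567": "123456"})
for guess in ("000000", "000001", "000002", "000003", "000004"):
    guard.attempt("5551234567", guess)
print(guard.attempt("5551234567", "123456"))  # correct PIN, but too late
```

The point of the design is that the count is attached to the account being attacked, not to anything the attacker controls.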

In addition, there are a number of best practices Virgin should implement to protect against bad behavior, even if someone knows your PIN:

  • Provide the same error message when someone tries to authenticate with an invalid phone number, as when they try to authenticate with a good phone number but an invalid PIN. Based on the response to the login, I can determine whether your number is a Virgin number or not, making it easy to find targets for this attack.

  • Any time an email or mailing address is changed, send a mail to the old address informing them of the change, with a message like “If you did not request this change, contact our help team.”

  • Require a user to enter their current ESN, or provide information in addition to their password, before changing the handset associated with an account.

  • Add a page to their website explaining their policy for responsible security disclosure, along with a contact email address for security issues.
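The first item in that list is easy to sketch. The function below returns the same response whether the phone number is unknown or the PIN is wrong, so the reply cannot be used to find valid targets. The `login` function, the `accounts` dictionary, and the error text are all made up for illustration.

```python
import hmac

GENERIC_ERROR = "Invalid phone number or PIN."


def login(accounts, phone, pin):
    """Return (success, message) without revealing which field was wrong."""
    stored = accounts.get(phone)
    # Compare against a dummy value for unknown numbers so both code
    # paths do the same amount of work.
    candidate = stored if stored is not None else "000000"
    matches = hmac.compare_digest(candidate.encode(), pin.encode())
    if stored is not None and matches:
        return True, "Welcome back."
    return False, GENERIC_ERROR


accounts = {"5551234567": "123456"}  # hypothetical account store
print(login(accounts, "5551234567", "999999"))  # real number, wrong PIN
print(login(accounts, "5550000000", "999999"))  # number not on file
```

Both failed calls produce an identical response, which is the property that matters here.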

History of my communication with Virgin Mobile

I tried to reach out to Virgin and tell them about the issue before disclosing it publicly. Here is a history of my correspondence with them.

  • August 15 – Reach out on Twitter to ask if there is any other way to secure my account. The customer rep does not fully understand the problem.

  • August 16 – Brute force access to my own account, validating the attack vector.

  • August 15-17 – Reach out to various customer support representatives, asking if there is any way to secure accounts besides the 6-digit PIN. Mostly confused support reps tell me there is no other way to secure my account. I am asked to always include my phone number and PIN in replies to Virgin.

    Email screenshot of Virgin asking me to include my PIN

  • August 17 – Support rep Vanessa H escalates the issue to headquarters after I explain I’ve found a large vulnerability in Virgin’s online account security. Steven from Sprint Executive and Regulatory Services gives me his phone number and asks me to call.

  • August 17 – I call Steven and explain the issue. He can see the problem and promises to forward it to the right team, but will not promise any more than that. I ask to be kept in the loop as Virgin makes progress investigating the issue. In a followup email I provide a list of actions Virgin could take to mitigate the issue, mirroring the list above.

  • August 24 – Follow up with Steven, asking if any progress has been made. No response.

  • August 30 – Email Steven again. Steven writes that my feedback “has been shared with the appropriate managerial staff” and “the matter is being looked into”.

  • September 4 – I email Steven again explaining that this response is unacceptable, considering this attack may be in use already in the wild. I tell him I am going to disclose the issue publicly and receive no response.

  • September 13 – I follow up with Steven again, informing him that I am going to publish details of the attack in 24 hours, unless I have more concrete information about Virgin’s plans to resolve the issue in a timely fashion.

  • September 14 – Steven calls back to tell me to expect no further action on Virgin Mobile’s end. Time to go public.

Update, Monday night

  • Sprint PR has been emailing reporters telling them that Sprint/Virgin have fixed the issue by locking people out after 4 failed attempts. However, the fix relies on cookies in the user’s browser. This is like Virgin asking me to tell them how many times I’ve failed to log in before, and using that information to lock me out. They are still vulnerable to an attack from anyone who does not use the same cookies with each request. (ed: This issue has been fixed as of Tuesday night)

  • This vulnerability only affects Virgin USA, to my knowledge; their other international organizations appear to only share the brand name, not the same code base.

Update, Tuesday night

Virgin’s login page was down for four hours from around 5:30 PDT to 9:30 PDT. I tried my brute force script again after the page came back up. Where before I was getting 200 OK’s with every request, now about 25% of the authentication requests return 503 Service Unavailable, and 25% return 404 Not Found.

Wednesday morning

Virgin took down their login page for 4 hours Tuesday night to deploy new code. Now, after about 20 incorrect logins from one IP address, every further request to their servers returns 404 Not Found. This fixes the main vulnerability I disclosed Monday.

I just got off the phone with Sprint PR representatives. They apologized and blamed a breakdown in the escalation process. I made the case that this is why they need a dedicated page for reporting security and privacy issues, and an email address where security researchers can report problems like this, and know that they will be heard.

I gave the example of Google, who says “customer service doesn’t scale” for many products, but will respond to any security issue sent to security@google.com in a timely fashion, and in many cases award cash bounties to people who find issues. Sprint said they’d look into adding a page to their site.

Even though they’ve fixed the brute force issue, I raised issues with PIN-based authentication. No matter how many automated fraud checks they have in place, PINs as passwords are a bad idea because:

  • People can’t use their usual password, so they might pick something more obvious, like their birthday, to remember it.

  • Virgin’s customer service teams ask for it in emails and over the phone, so if an attacker gains access to someone’s email, or is within earshot of someone on a call to customer service, they have the PIN right there.

  • If I get access to your PIN through any means, I can do all of the stuff mentioned above – change your handset, read your call logs, etc. That’s not good and it’s why even though Google etc. allow super complex passwords, they allow users to back it up with another form of verification.

I also said that they should clarify their policy around indemnification. I never actually brute forced an account where I didn’t know the PIN, or issued more than one request per second to Virgin’s servers, because I was worried about being arrested or sued for DoSing their website. Fortunately I could prove this particular flaw was a problem by dealing only with my own account. But what if I found an attack where I could change a number in a URL, and access someone else’s account? By definition, to prove the bug exists I’d have to break their terms of service, and there’s no way to know how they would respond.

They said they valued my feedback but couldn’t commit to anything, or tell me about whether they can fix this in the future. At least they listened and will maybe fix it, which is about as good as you can hope for.

Liked what you read? I am available for hire.

Reddit’s database has two tables

Steve Huffman talks about Reddit’s approach to data storage in a High Scalability post from 2010. I was surprised to learn that they only have two tables in their database.

Lesson: Don’t worry about the schema.

[Reddit] used to spend a lot of time worrying about the database, keeping everything nice and normalized. You shouldn’t have to worry about the database. Schema updates are very slow when you get bigger. Adding a column to 10 million rows takes locks and doesn’t work. They used replication for backup and for scaling. Schema updates and maintaining replication is a pain. They would have to restart replication and could go a day without backups. Deployments are a pain because you have to orchestrate how new software and new database upgrades happen together.

Instead, they keep a Thing Table and a Data Table. Everything in Reddit is a Thing: users, links, comments, subreddits, awards, etc. Things keep common attributes like up/down votes, a type, and creation date. The Data table has three columns: thing id, key, value. There’s a row for every attribute. There’s a row for title, url, author, spam votes, etc. When they add new features they didn’t have to worry about the database anymore. They didn’t have to add new tables for new things or worry about upgrades. Easier for development, deployment, maintenance.

The price is you can’t use cool relational features. There are no joins in the database and you must manually enforce consistency. No joins means it’s really easy to distribute data to different machines. You don’t have to worry about foreign keys or doing joins or how to split the data up. Worked out really well. Worries of using a relational database are a thing of the past.
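To make the layout concrete, here is a small sketch of a thing table and a data table using SQLite from Python's standard library. The column names, types, and helper functions are my own guesses for illustration; Reddit runs on Postgres and its real schema is not shown in the quoted post.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE thing (
        id      INTEGER PRIMARY KEY,
        type    TEXT NOT NULL,
        ups     INTEGER NOT NULL DEFAULT 0,
        downs   INTEGER NOT NULL DEFAULT 0,
        created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE data (
        thing_id INTEGER NOT NULL,
        key      TEXT NOT NULL,
        value    TEXT,
        PRIMARY KEY (thing_id, key)
    );
""")


def set_attrs(thing_id, **attrs):
    """Store one row per attribute; values are kept as text."""
    db.executemany(
        "INSERT OR REPLACE INTO data (thing_id, key, value) VALUES (?, ?, ?)",
        [(thing_id, key, str(value)) for key, value in attrs.items()],
    )


def create_thing(type_, **attrs):
    """Insert the common fields, then the per-attribute rows."""
    cur = db.execute("INSERT INTO thing (type) VALUES (?)", (type_,))
    set_attrs(cur.lastrowid, **attrs)
    return cur.lastrowid


def load_attrs(thing_id):
    """Reassemble a thing's attributes into a dictionary."""
    rows = db.execute(
        "SELECT key, value FROM data WHERE thing_id = ?", (thing_id,)
    )
    return dict(rows)


link = create_thing("link", title="Two tables", url="http://example.com")
set_attrs(link, spam_votes=0)  # a new attribute, with no ALTER TABLE
print(load_attrs(link))
```

Adding `spam_votes` is just another row, which is the convenience described above. The cost shows up in the same sketch: every value comes back as text, and nothing in the database stops a link from missing its title.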

This fits with a piece I read the other day about how MongoDB has high adoption for small projects because it lets you just start storing things, without worrying about what the schema or indexes need to be. Reddit’s approach lets them easily add more data to existing objects, without the pain of schema updates or database pivots. Of course, your mileage is going to vary, and you should think closely about your data model and what relationships you need.

Update, 10:05AM PDT: It’s worth reading the comments from a current Reddit engineer on this post. Particularly this one:

I’m personally not a fan of using an RDBMS as a key-value store – but take a look at, say, line 60 of the accounts code. Each item in that _defaults dictionary corresponds to an attribute on an account. For pretty much all of those (1) we don’t need to join on it and (2) we don’t want to do database maintenance just to add a new preference toggle. Those points are particularly more important when you’ve got a staff of 2-3 engineers. Sure, reddit has more now – but we’ve also now got a lot of data to migrate if we wanted to change, a lot of code to rewrite, and a lot of more important problems.

The data architecture made sense for Reddit as a small company that had to optimize for engineering man hours. Now they are much bigger and can afford a saner structure. He/she mentions that they are in the process of migrating their Postgres data over to Cassandra, but slowly.

Update, 11:31PM PDT: A former engineer at reddit adds this comment.

There isn’t a “table” for a subreddit. There is a thing/data pair that stores metadata about a subreddit, and there is a thing/data pair for storing links. One of the properties of a link is the subreddit that it is in. Same with the comments. There is one thing/data pair for comments and the subreddit it is in is a property.

Still today I tell people that even if you want to do key/value, postgres is faster than any NoSQL product currently available for doing key/value.

Update, 7:11PM PDT: From Hacker News, it looks like they use two tables for each “thing”, so a thing/data pair for accounts, a thing/data pair for links, etc.

Bash user? Try Zsh, the more usable terminal shell

On most operating systems, the default command line shell is Bash. Bash is a perfectly good shell. However, there are a number of tasks that are slow or annoyingly time-consuming in Bash, particularly relating to the shell history. It's like using your phone to complete tasks, instead of a laptop.

Zsh is a newer terminal shell with slightly different syntax than Bash. Zsh makes many smart usability decisions where Bash fails, helping you get work done faster. I thought I'd highlight some of the best examples.

Zsh skips repeat commands in history. Let's say I run our unit test suite 20 times in a row. With Bash, if I wanted to get the command I ran just before that, I'd have to hit the up arrow 21 times to get the previous line. Zsh combines all the duplicate commands into one history item, so you only have to press 'up' twice to get that old command.

Ctrl+U deletes the whole line, no matter where the cursor is. In Bash Ctrl+U will delete everything left of the cursor. I have never wanted to delete everything left of the cursor without also deleting the whole line.

Command sensitive history. In Zsh if I type git and then press 'up', Zsh will cycle through all of my latest git commands, skipping any non-git commands. This is especially useful with a few infrequent commands I run that take many command line options, such as our configuration scripts.

Shared history across tabs. With Bash, using the 'up' arrow to access previous commands only gives you access to the history of the current tab. This is like having Chrome only remember your history for one given tab, instead of sharing it across all your tabs.

Smarter tab completion out of the box. Zsh is smart about figuring out which filenames you actually want. There's a good list of tab completion examples here; the two that stand out for me are:

  • Typing

    rm foo*<tab> 
    

    will expand to

    rm foo.txt foo.txt.swp foo.o foo.c
    

    or however many files beginning with foo there are in the directory.

  • Typing

    vim zzz<tab>
    

    will match in the middle of files, for example blah-blah-blah-zzz.txt.

You should give it a try - it may be a little unfamiliar at first, but you'll save a lot of time and annoyance by completing tasks more quickly with Zsh. You can switch by typing at the Bash command prompt:

$ zsh

That will load Z Shell in the current window. You can change your shell permanently by running:

$ chsh -s $(which zsh)

Let me know what you think!

Why I’m not switching my bank to Simple

I finally got my Simple invite. Simple is an online banking startup that wants to make the experience of using banks much, much better. As far as I can tell they are delivering on that experience - their site and iPhone app are a joy to use. They also focus on the whole experience - the ATM card that came in the mail was delivered in a beautiful package, and the signup process, which requires gathering a ton of private information about you, was painless.

That said, I realized shortly after signing up that I'm going to be sticking with Ally Bank for one simple reason: they refund ATM fees. I used to hate ATMs. The options I had with ATMs were either a) look up the nearest in-network ATM online, then walk three blocks, all the while wondering why I am doing this to save a measly $3; or b) go to the liquor store ATM and hate myself for spending money to get at my money.

ATM fee refunds turned an experience I hated into something I can brag about ("It doesn't matter, I get the money back at the end of the month"). I get about $120 in ATM refunds every year, plus the time and stress I save by not hunting for an in-network ATM, which is easily worth two or three percentage points of interest. I don't know how it makes business sense for Ally to pay for my ATM fees, but it's their killer feature.

I appreciate what Simple is doing and wish them the best. If you are still saddled with a card from a bank with physical branches, switching to Simple is one of the best things you could do. However, at the moment Ally's ATM fee refunds (a better ATM user experience) win out over Simple's awesome website and iPhone app (a better web experience).

Why we’ll never shut down our API

Recently there have been several articles by startups who built a service around another company's API, and then got upset that the other company changed their Terms of Service, revoked their API access, or sent them a cease and desist letter. Others have rightly pointed out that Craigslist, Netflix, &c. offer APIs to improve the value of their own services, and building a company around access to another's API may be foolhardy.

There's an obvious exception; you rarely hear about terms-of-service or cease-and-desist shenanigans from companies with APIs at their core. Twilio (where I work) is a good example. The amount of money we make is directly correlated to the number of API requests we receive. We have no reason to shut anyone off, because we make more money when people use our API more. Google shut off its Translate API recently, despite a large volume of legitimate use, because it interfered with search quality - spammers were using it to generate thin copies of good content. We would never turn off our API, because it's core to our business.

(One caveat: if your big plan is to use Twilio to spam people, then yeah, we're going to shut you off. We make money when people spend money with us, but we do have an Acceptable Use Policy.)

The only conceivable situation in which we'd shut down our API is if the company was forced to shut down. That's a risk, but we believe that risk is low and we're working every day to minimize it. So we're not going to shut off your access, unlike LinkedIn or Craigslist or another company whose API is a convenience.

Sentences to consider

There is a myth among neoliberal economists that labor markets have always “adjusted” sua sponte: that when laborers were displaced from farms, “higher value” factories arose to employ them; that when the factories were downsized and offshored, a more pleasant, higher-value service economy came to be; etc. That narrative is wrong, he told me. At best it is criminally incomplete. With each technological change, new social institutions had to arise to sustain dispersed purchasing power despite a reduction of numbers and bargaining power of workers in old industries. Displaced workers ultimately did find new work, but only because the new social institutions “artificially” created buyers for all the things displaced workers reinvented themselves to sell. Without this institutional innovation, Tyrone tells me, something like the Great Depression would have been the new normal.
That's from Randy Waldman at Interfluidity.

What will happen to house prices in the Bay Area after the Facebook IPO?

A lot of people seem to think that house prices in the Bay Area will rise significantly after the Facebook IPO. It's fun to speculate about, but those people seem to be assuming a lot. Here are some of those assumptions I'm not so sure about:

  1. Facebook will add a significant number of new millionaires to the Bay Area.

  2. Everyone that benefits from the Facebook IPO will want to invest that money in a house, instead of into the stock market, retirement, cars, etc.

  3. Everyone that wants to invest their option cash in a house will buy shortly after they liquidate their options.

  4. Most of Facebook's new millionaires will buy houses in the Bay Area.

  5. The supply of housing for multi-millionaires is inelastic.

  6. The people buying new houses will not be vacating their old ones.

I'm not so sure that those assumptions are good ones. A San Francisco Chronicle article from 2009 said that there were 136,000 millionaires in the Bay Area in 2009, a number that has surely risen since then. Facebook has 3500+ employees, and on the high side I would guess maybe 1500 are going to earn enough money from the IPO to change their lifestyle. This would add about 1% to the Bay's total, assuming all of Facebook's newly minted millionaires live in the Bay Area.

If anything the prices of houses at the very high end (10 million plus) will rise. But it's hard to feel very sorry for people that are priced out of that market. If anything a housing shortage here may help remove or loosen some of the Bay's many restrictive housing and zoning laws.

Virtualenv is an anti-pattern (for beginners)

Every time I do a user test with a beginning programmer, I remember how hard computers are, how unforgiving the tools are, and end up wanting to apologize for how annoying and strict programming is. We are making progress with teaching people how to code, but it's still really hard.

For example, if you are just getting started with Python, here's a short list of problems you might face when trying to set up Flask, which is by far the easiest Python web server to set up.

  • Learning how to cd in the terminal
  • How URLs requested by a user map to actual code
  • HTML, CSS and JavaScript, because you actually want it to be pretty.
  • How to read and write things from a database
  • Installing Flask, so learning how to use pip or easy_install
  • Python telling you your file is no good because it mixes tabs and spaces.
  • How to run Flask locally

How to draw an owl: 1. Draw some circles 2. Draw the rest of the fucking owl

And that's not even counting the stuff that's so obvious to us we forget to mention it. Most quickstart guides also fail to help people make incremental progress.

Game designers are great at teaching new, hard things. They have to be, or no one will play their games. You will notice that games don't start with you battling Ganon in an epic death match; they start with you learning how to use the character and perform actions like make a kick, or open a door. Through a series of incremental successes you become an expert in the game and can tackle more and more complex tasks.

It bugs me to see so many Python tutorials mention virtualenv as a requirement to get started. (virtualenv is a tool for sandboxing your Python apps, so each Python project on your computer is using its own set of packages). The biggest advantage of virtualenv is that you can have different versions of the same Python package (like Flask or requests) that are required by different projects, whereas if you install them system-wide you can't.

However, recommending virtualenv just adds another thing you have to do before you can see pretty lights on the screen, and represents another possible opportunity for people to lose interest, and start doing something else instead of learning how to get a web server set up.

It also introduces a significant opportunity for confusion; the "It was working yesterday, why isn't it working now?" problem. You need to remember to source your virtualenv file in every Terminal shell where you're running Python, or your terminal will tell you it can't find the library you literally just installed. Needless to say this is confusing, the Terminal won't tell you how to solve the problem, and Googling for the answer isn't likely to give you the solution you need, because it's such a generic error message.
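If you do end up in this situation, a quick check from inside Python can tell you which interpreter is actually running. This is a sketch using a common idiom, not something from the virtualenv docs; the `in_virtualenv` name is mine, and the `real_prefix` check is there because older virtualenv releases set that attribute instead of `base_prefix`.

```python
import sys


def in_virtualenv():
    """Best-effort check for whether this interpreter runs inside a virtualenv."""
    # Older virtualenv releases set sys.real_prefix. The standard venv
    # module and newer virtualenv releases make sys.base_prefix differ
    # from sys.prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("Interpreter:", sys.executable)
print("Prefix:     ", sys.prefix)
print("Virtualenv: ", in_virtualenv())
```

If this prints `False` in the terminal where the import fails, the environment probably was not activated in that shell.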

I've never seen beginners run into the problem of needing conflicting versions of a Python package for two different projects. I was comfortable dumping everything into site-packages for over two years of Python development; only when I started working at Twilio did I need to start installing virtualenvs for every project.

As a community, I believe we should stop recommending that beginners install virtualenv. The faster we can get beginners to a Holy Shit, I Wrote Code That Made Something Happen moment, the better, and virtualenv is a big block for getting to that point. Instead I'd recommend installing pip using the one line curl program in the second paragraph here. virtualenv is something that's more appropriate to learn about and use once you have a few Python projects under your belt.

#1 on HN for Six Hours: Postmortem

On Saturday my post on how not to ask questions at a conference was the number one post on Hacker News for a solid six hours, between four and ten PM. Here are some raw stats from the last day.

  • Since the post was submitted, I've gotten 31,787 pageviews to my site; 14,478 in the nine hours between post submission and midnight, and another 15k on Sunday. One post can bring in amazing amounts of traffic, and justify all of the effort you've put into creating quality blog content.

  • In just the last two days I've gotten 50% as much traffic as I did in the whole previous year.

Google Analytics Traffic
Can you tell which days I made the frontpage of Hacker News?

  • Of those pageviews, 30,807 were for the article itself. 414 people visited my homepage (1 in 72 people) and 253 people visited my about page (1 in every 130).

  • 8,142 visits (roughly 27%) came from mobile devices. I am really glad I added a mobile/responsive view for smaller screens earlier this year, as this makes the content much more consumable on a small screen.

  • 69% of mobile visits (18% of the total) came from an iOS device.

  • Roughly 10,000 clicks came from Hacker News and 8,700 came from the Programming subreddit, where my post is still on the frontpage a day and a half later. If your post is doing well on HN, it probably makes sense to submit it to Proggit as well, as there's a large contingent of people that use Proggit exclusively.

  • 1,479 people have clicked on the aggregate Bit.ly link and 97 people have Tweeted the post (roughly 1 in 300).

  • I added 18 Twitter followers (about 1 in every 1800 visitors), bumping my total to 418. I added one new Bitbucket follower and zero new newsletter subscribers.

  • 23 people left comments on the post (about 1 in every 1300). 156 people left comments on Hacker News, off about 10k clicks, and ~160 people left comments on Reddit, off of 9k clicks.
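For anyone who wants to redo the "1 in N" arithmetic, here is a short sketch. It divides everything by the total pageview count, which is an assumption on my part; some of the ratios above were rounded or computed against the article's own pageviews, so the results will not match exactly.

```python
pageviews = 31_787  # total pageviews since the post was submitted

# Counts taken from the list above
actions = {
    "visited the homepage": 414,
    "visited the about page": 253,
    "tweeted the post": 97,
    "commented on the post": 23,
    "followed me on Twitter": 18,
}


def one_in(total, count):
    """Express count out of total as a rounded '1 in N' figure."""
    return round(total / count)


for label, count in actions.items():
    rate = count / pageviews
    print(f"{label}: {rate:.2%} (about 1 in {one_in(pageviews, count)})")
```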

I've posted my "conversion rate" in all cases because I don't feel like it's amazingly high. This is probably the nature of this sort of traffic though; people are there to read an article, learn something, and then move on to the next thing. I suppose if I can reach the frontpage a few times in short succession, people may start to recognize my name and there would be a snowball effect, in terms of the number of people signing up to follow me or posting comments.

I don't have the tools in place at the moment to be able to test my "conversion rates" and see whether they can be improved. All in all though, the low rates at which people are clicking through to other material on my site suggest that you should put any information you'd like readers to know about yourself on the post view page, or in the footer of the post itself.

“The best recommendations have a lot of verbs”

Via Tyler Cowen, an author from the Wall Street Journal interviewed the head of admissions at the Harvard Business School. The whole article is good, but this particular line stood out:

The best recommendations have a lot of verbs. They say, "She did this," versus adjectives that simply describe you.

I remember in 5th grade that we had to write Show Not Tell stories. The idea was to get out of the habit of writing "Kyle is in 3rd grade and he is really kind" - style stories and instead write "When Joey's mom couldn't pick him up, Kyle walked him all the way home, even though it was two miles in the wrong direction."

I don't know why later teachers dropped the Show Not Tell agenda from the curriculum, but apparently people still write in this style. Maybe recommenders are lazy and it's easier to write "Shannon is a hard worker" than it is to come up with a concrete example. Maybe the recommender doesn't know the student very well, which is discussed in the article and is a problem.

The other possibility is that the person being recommended hasn't done anything interesting. It's easy to tell, because if you have done things, people tend to mention them when they're introducing you to someone, like "This is Jeff. Jeff wrote the entire billing system." You can also tell because the bullet points on your resume will have really bland verbs in them that don't really say anything, like "Developed marketing skills" or "Monitored social analytics tools for Company X".

One day you are going to have to wake up and decide to be Someone that Does Things. The World of Doing can be scary at first because there are lots of things that need to be done and no one is there to tell you how to do them.

So: what verbs do people use to describe you? Which of the versions below would you rather someone used to describe you?

  • "He's really good at finding tips and tricks to save time."

  • "He built a replacement for the school's calendar system and got 550 people to sign up."


  • "She is a really fierce competitor."
  • "She won the regional finals for her team by making free throws and getting a key steal in the final minutes."


  • "She's a hard worker."
  • "She rewrote the website to make it 100% faster, which boosted signups by 50%."


  • "His code is always reliable."
  • "While he was in charge, the API was down for a total of three minutes in two years."

Related: See Be Specific at Less Wrong.
