Sift Science Uses Machine Learning to Weed Out Credit Card Fraud

8/22/13

Google has a team of 120 engineers dedicated solely to fighting fraud and webspam. Its leader, Matt Cutts, is probably the most familiar face at the company, after founders Larry Page and Sergey Brin and chairman Eric Schmidt.

Up in Seattle, Amazon has more than twice as many full-time fraud-fighters. And when you consider that online fraud costs businesses an estimated $3.4 billion in credit-card chargebacks every year, it’s easy to see why these two Internet giants would invest so much in detecting and evading bad guys. But what about the “little guys”—the online businesses like Uber or Airbnb that process millions of dollars in online payments every year, but aren’t big enough to hire whole teams of engineers to fight fraud? Shouldn’t they have access to state-of-the-art fraud detection technologies too?

That’s exactly the market Sift Science wants to serve. The San Francisco startup, which emerged from the Y Combinator accelerator in 2011, boasts a team of eight ex-Google engineers building a cloud-based system that monitors other companies’ e-commerce systems in real time. If Sift’s machine-learning algorithms spot a credit-card purchase that looks suspicious, it gets flagged for review by a human. The idea is to disallow the highest-risk transactions and reduce costly chargebacks. (Merchants, not consumers or credit card companies, are usually the ones on the hook when fraudsters make purchases using stolen credit cards.)

“It’s amazing, the similarities between Matt Cutts’ job and what we are trying to accomplish,” says Jason Tan, Sift Science’s co-founder and CEO (pictured above). “Mom-and-pop and mid-tier e-commerce businesses don’t have the resources to hire 120 engineers, and they don’t have the cutting-edge technology that Google does. Our intuition was, why can’t they hire a third party to abstract all of that away [in the cloud] and make it easy for them to sign up and get protected.”

A map of global credit-card fraud rates, based on data collected by Sift Science. The countries generating the most fraudulent transactions: Latvia, Egypt, the United States, Mexico, Ukraine, Hungary, Malaysia, Colombia, Romania, and the Philippines. (Image courtesy of Sift Science.)

Tan and his co-founder, Brandon Ballinger, met as roommates and fellow computer science majors at the University of Washington in Seattle, where they graduated in 2006. Ballinger went on to Google, while Tan worked for a series of Seattle tech startups, including Zillow, Optify, and BuzzLabs. In 2011, Ballinger got into Y Combinator with an idea for a mobile-social-local app; he was able to talk Tan into joining him, but only on the condition that they try something more challenging.

Tan and Ballinger (who has since left the company) got interested in the fact that most merchants and credit-card companies still rely on crude rules-based systems to screen transactions. To minimize chargebacks, a company might, for example, look at past episodes of fraud and set up a rule saying “If the transaction is for more than $10,000 and the IP address of the purchaser is in Nigeria, flag it as suspicious.”

But the next fraudulent transaction might originate in China, necessitating a new rule. And the anti-Nigeria rule might mistakenly filter out legitimate orders. “So you have this big mess of hundreds of rules, but it’s very static,” not to mention porous and unreliable, Tan says.
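To make that brittleness concrete, here is a minimal sketch of the kind of static rule Tan describes. The `Transaction` fields, the `is_suspicious` function, and the sample orders are hypothetical illustrations, not any merchant's or Sift Science's actual system.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical fields, chosen only to mirror the example rule in the text.
    amount_usd: float
    ip_country: str  # two-letter country code derived from the purchaser's IP
    is_fraud: bool   # ground truth, known only after the fact


def is_suspicious(tx: Transaction) -> bool:
    """Static rule: flag orders over $10,000 whose IP address is in Nigeria."""
    return tx.amount_usd > 10_000 and tx.ip_country == "NG"


# A legitimate large order from Nigeria gets flagged (a false positive) ...
legit_order = Transaction(amount_usd=12_500, ip_country="NG", is_fraud=False)
# ... while a fraudulent order from another country passes (a false negative).
new_fraud = Transaction(amount_usd=15_000, ip_country="CN", is_fraud=True)

print(is_suspicious(legit_order))  # True
print(is_suspicious(new_fraud))    # False
```

Catching the second order would mean writing another hand-made rule, which is how a merchant ends up with hundreds of them.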

The obvious way to improve fraud detection, Tan and Ballinger thought, would be to ditch all the manually constructed rules and use machine-learning algorithms to identify the real patterns that foreshadow fraud at specific e-commerce sites. “With machine learning you can teach the computers to build the rules themselves using the statistics as data,” Tan says.
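As a rough illustration of letting statistics stand in for hand-written rules, the sketch below trains a tiny naive Bayes scorer on labeled past orders. Naive Bayes is used here only because it is easy to show in a few lines; the article does not say which algorithms Sift Science uses, and the `history` data, feature names, and function names are all made up for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled history: (features, was_fraud).
history = [
    ({"country": "US", "size": "small"}, False),
    ({"country": "US", "size": "small"}, False),
    ({"country": "US", "size": "large"}, False),
    ({"country": "NG", "size": "large"}, False),
    ({"country": "NG", "size": "small"}, False),
    ({"country": "CN", "size": "large"}, True),
    ({"country": "CN", "size": "large"}, True),
    ({"country": "US", "size": "large"}, True),
]


def train(examples):
    """Count how often each feature value appears in fraud vs. legitimate orders."""
    label_counts = Counter()
    value_counts = {True: Counter(), False: Counter()}
    seen_values = defaultdict(set)  # distinct values observed per feature
    for features, was_fraud in examples:
        label_counts[was_fraud] += 1
        for name, value in features.items():
            value_counts[was_fraud][(name, value)] += 1
            seen_values[name].add(value)
    return label_counts, value_counts, seen_values


def fraud_log_odds(model, features):
    """Naive Bayes log-odds of fraud; positive means fraud looks more likely."""
    label_counts, value_counts, seen_values = model
    # Prior odds; assumes the history contains both fraud and legitimate orders.
    score = math.log(label_counts[True] / label_counts[False])
    for name, value in features.items():
        k = len(seen_values[name])  # add-one smoothing over the observed values
        p_fraud = (value_counts[True][(name, value)] + 1) / (label_counts[True] + k)
        p_legit = (value_counts[False][(name, value)] + 1) / (label_counts[False] + k)
        score += math.log(p_fraud / p_legit)
    return score


model = train(history)
# In this toy history, a large order from Nigeria scores as low risk, while a
# large order from China scores as high risk, with no hand-written rule.
print(fraud_log_odds(model, {"country": "NG", "size": "large"}) < 0)  # True
print(fraud_log_odds(model, {"country": "CN", "size": "large"}) > 0)  # True
```

Retraining on fresh history changes the scores automatically, which is the contrast with a static rule list. A production system would use far more signals and far more data than this toy.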

The procedure wouldn’t be all that different from the speech-recognition work Ballinger had been doing using machine learning at Google, or the sentiment analysis work Tan had been doing at BuzzLabs; what had changed was that it was getting cheaper to analyze large numbers of transactions in the cloud, using distributed database technology. “Only now, with HBase and Hadoop, has the technology evolved to the point that we can build this kind of infrastructure outside of Google,” says Tan.

About a year after graduating from Y Combinator, Sift Science secured Series A funding from Union Square Ventures and First Round Capital, bringing its total investment to …

Wade Roush is a contributing editor at Xconomy.
