The untapped potential of machine learning for detecting fraud
E-commerce fraud protection company Signifyd has recently signed up behavioral analytics expert Long-Ji Lin to fill the position of Chief Scientist.
“For advertisers, Lin perfected his model’s ability to predict which users would take each desired action, such as clicking on an ad or purchasing from an advertiser. For e-commerce merchants, Lin is focused on innovations in modeling to verify if the user who made the purchase on their site is in fact the person authorized to do so, or if they are a fraudster,” the company explained.
Lin, who is best known for his extensive work in the adtech industry and for being a pioneer in behavioral targeting, will now “provide much-needed predictive capabilities for an industry where orders are still reviewed manually.”
What better opportunity than this to sit down with him and enquire about his expectations?
Fraud prevention aided by machine learning
Machine learning is increasingly being introduced to help enterprise defenders fight attackers who are after information or money. E-commerce fraudsters fall into the latter category.
“Fraudsters are highly motivated to outsmart our system. To beat them with artificial intelligence, we have some big challenges,” Lin told Help Net Security.
“Currently, we have access to lots of information about suspect fraudsters, including their purchase activities, online browsing activities, social networks, and even street pictures of their neighborhood and fake identification they submit to get their orders approved. The real challenge is how we can make sense of this unstructured data and then make good approve/decline decisions for thousands of merchants in real-time.”
That’s because humans are good at handling unstructured information, but today’s machine learning technology is optimized to deal with mostly structured data.
“While we still rely on humans to understand the data and help make structure out of it, there are many helpful techniques we can use such as text analysis, fuzzy matching (of postal addresses, for example), and graph analysis,” he noted.
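To make the fuzzy matching idea concrete, here is a minimal sketch in Python of how two postal addresses written differently might be compared. It is only an illustration, not Signifyd's implementation: the function names, the tiny abbreviation table and the use of the standard library's difflib are all assumptions made for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table for illustration only; a production
# system would need a far more complete, locale-aware mapping.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "apartment": "apt",
    "suite": "ste",
}


def normalize_address(address: str) -> str:
    """Lowercase, strip punctuation and abbreviate common address words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", address.lower())
    tokens = cleaned.split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def address_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two addresses."""
    return SequenceMatcher(
        None, normalize_address(a), normalize_address(b)
    ).ratio()


if __name__ == "__main__":
    billing = "123 Main Street, Apt. 4"
    shipping = "123 main st apt 4"
    print(address_similarity(billing, shipping))
```

In a sketch like this, a high score suggests that the billing and shipping addresses refer to the same place despite differences in spelling or punctuation, which turns a free-text field into a structured signal a model can use.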
Another challenge is that when the company’s model declines a transaction because of risk concerns, the company never learns whether the transaction was really fraudulent or whether its system made an error and declined a perfectly good order.
“In order to improve our learning in this regard, we sometimes need to approve risky transactions,” he explained. And then the challenge becomes: “How do we decide which risky orders to approve, to maximize our learning, while not losing too much money to the fraudsters?”
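One simple way to picture this trade-off is an exploration rule that approves a small random fraction of risky orders, but only when the order value is low enough to cap the potential loss. The Python sketch below uses an epsilon-greedy style rule chosen purely for illustration. Lin does not say which method Signifyd uses, and the function name, thresholds and rates here are all hypothetical.

```python
import random


def decide(
    risk_score: float,
    order_value: float,
    rng: random.Random,
    risk_threshold: float = 0.7,      # hypothetical cutoff for "risky"
    explore_rate: float = 0.05,       # hypothetical share of risky orders to approve
    max_explore_value: float = 100.0, # hypothetical cap on money put at risk
) -> tuple[str, bool]:
    """Return (decision, explored) for one order.

    `explored` is True when a risky order was approved on purpose so that
    its eventual outcome (fraud or not) can be used as a training label.
    """
    if risk_score < risk_threshold:
        return "approve", False

    # Risky order: occasionally approve it to learn the true outcome,
    # but only if the order is cheap enough to limit the loss.
    if order_value <= max_explore_value and rng.random() < explore_rate:
        return "approve", True

    return "decline", False


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs behave the same
    orders = [(0.2, 80.0), (0.9, 40.0), (0.95, 900.0)]
    for score, value in orders:
        print(score, value, decide(score, value, rng))
```

Raising the exploration rate or the value cap yields more labeled outcomes for risky orders at the price of more money exposed to fraudsters, which is the balance Lin describes.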
Big data analytics, and how it impacts modern enterprise security architectures
Lin says that the industry is still in the early stages with big data analytics for solving fraud prevention problems.
Companies today do have access to much richer user information than they did a few years ago. But without sufficient big data technology, that information may simply end up scattered across various database tables, never fully combined or leveraged.
“One cannot just ‘plug’ big data technology into an existing system and expect it to work wonders,” Lin noted.
“We need to re-architect the entire system, including databases, real-time processing, offline analytics, user interface, and many other components, to leverage big data technology appropriately. For example, Spark is an open-source framework developed over the past several years for large-scale data processing, and many machine learning algorithms have been developed successfully on top of Spark.”
“Big data technology evolves rapidly, and we will need to keep up to take advantage of its many new advances. By taking advantage of this new technology, we can already conduct data analytics faster and deliver better results than ever before,” he concluded.