ASIC prototype to nab online scams

The Australian Securities and Investments Commission will next month implement a prototype resulting from a software research project that has improved detection of online investment scams nearly ten-fold.

ASIC announced in February $1 million in funding for Scamseek, a joint project between the Capital Markets Cooperative Research Centre, the University of Sydney and Macquarie University, and industry partner SMARTS (Security Markets Automated Research Training and Surveillance), to develop a linguistic-based Internet Document Classification System.

According to Scamseek project director Professor Jon Patrick, from the University of Sydney's School of Information Technologies, ASIC's current Web search tool retrieves on average 80 possible scams, of which just one proves illegal after analysis.

Although the project is only three months old, Patrick said Scamseek's 10 staff (nine full-time) had improved this ratio to better than one scam in 10 suspect documents.

"Our objective is to get to better than one scam in two documents, and we're very optimistic we'll get there," he said.

The director of ASIC's electronic enforcement unit, Keith Inman, said ASIC would implement the prototype within a fortnight.

"We had built a concept system some time ago which was a slight improvement on our system. But this Scamseek prototype is a significant improvement on our current methods [of online scam detection]," he said.

The project is currently limited to searching HTML Web sites for unlicensed investment advice and unlawful fundraising.

Patrick said the project's current phase is planned to finish on September 30. However, Inman said if the good results continued, the project would progress to chat sites.

"Our research indicates that people use a range of channels on the Internet for these scams. They use HTML sites, bulletin boards, and chat sites.

"Chat sites will be a different vocabulary, a live conversation, but Jon's [Patrick] methodology uses machine learning to profile new trends and improve the hit rate on topics," Inman said.

Inman said the success of Scamseek had reduced the need for an ASIC 'surf day' this year. On these days, 20 ASIC staff each trawl the Internet for four hours searching for scams. More than 1000 Web pages are viewed, but this yields only two likely results. This is usually done a few times a year, he said.

Using the Scamseek prototype would increase ASIC's efficiency, Inman said, as the time usually taken to find scams could be spent pursuing Scamseek's results.

Professor Patrick said the way the project worked was that "ASIC supplied us with 7500 documents fitting three classes. There are the scam, [suspect] scam-light and non-scam classes. The target scam class is about 1.8% of the sample."
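To make the imbalance in that sample concrete, the quoted figures can be turned into rough counts. The Python sketch below uses only the 7500-document total and the 1.8% scam share from Patrick's quote. The function name is illustrative, and the split between the scam-light and non-scam classes is not given, so those two are lumped together.

```python
def scam_class_counts(total_docs: int, scam_share: float) -> tuple[int, int]:
    """Return (scam_docs, other_docs) for a sample with the given scam share."""
    scam_docs = round(total_docs * scam_share)
    return scam_docs, total_docs - scam_docs


# Figures quoted by Patrick: 7500 documents, about 1.8% in the target scam class.
scam_docs, other_docs = scam_class_counts(7500, 0.018)
print(scam_docs, other_docs)  # 135 7365
```

On those figures, a tool that labelled every document "not a scam" would be right more than 98% of the time while finding nothing, which is why the project reports its results as scams found per suspect document rather than as overall accuracy.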

The project team, which consists of linguists, computational linguists, and software engineers, was not told of the classes or allowed to see the documents before development, Patrick said.

"I can report that the linguists are having a very high success rate in identifying scams," he said.

"In some of the subsections of scams they're identifying 100 per cent of the scams correctly through hand-crafted methods."

The linguists were having "large success" with Nigerian e-mail scams, he said.

The software engineers were finding the challenge harder, however.

Due to the experimental nature of the project, the team chose the Python programming language to help in rewriting large amounts of code.

"Part of the problem is we're only finding 30% of the scams in the sample. There's some we're not seeing using our system.

"The challenge is can you reduce the workload [of analysing suspect documents], but how many scams are being sifted out in the process," he said.
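The two sides of that trade-off match what information retrieval calls precision (the share of flagged documents that really are scams) and recall (the share of all scams that get flagged). The Python sketch below shows how both are computed from the same set of judgements. The ten-document sample is invented for illustration and is not Scamseek data; the function and variable names are likewise assumptions.

```python
def precision_recall(actual, flagged):
    """Compute (precision, recall) from parallel lists of booleans.

    actual[i]  is True when document i really is a scam.
    flagged[i] is True when the system flagged document i as suspect.
    """
    true_hits = sum(1 for a, f in zip(actual, flagged) if a and f)
    flagged_total = sum(flagged)
    scam_total = sum(actual)
    precision = true_hits / flagged_total if flagged_total else 0.0
    recall = true_hits / scam_total if scam_total else 0.0
    return precision, recall


# Invented toy sample: 10 documents, 3 of them scams, 2 flagged by the system.
actual = [True, True, True, False, False, False, False, False, False, False]
flagged = [True, False, False, True, False, False, False, False, False, False]

p, r = precision_recall(actual, flagged)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case the system reaches "one scam in two documents" (precision 0.5) while still missing two of the three scams, which is the situation Patrick describes: a lighter workload for analysts does not by itself show how many scams were sifted out.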

The Scamseek project has involved three systems, he said. A metasearch engine, or 'Web Spider', searches HTML documents for possible scams. These documents are then analysed by the Statistical Information Retrieval System. The classifier used in this system is developed on a lab system before being exported.
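The train-in-the-lab, export, then classify flow can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the article does not say which statistical classifier Scamseek uses, so a small naive Bayes model stands in for it, JSON stands in for the export format, and the training sentences are invented.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labelled_docs):
    """Build a multinomial naive Bayes model from (text, label) pairs."""
    doc_counts = Counter()
    word_counts = {}
    for text, label in labelled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = sorted({w for counts in word_counts.values() for w in counts})
    # Plain dicts and lists only, so the model can be serialised as JSON.
    return {
        "doc_counts": dict(doc_counts),
        "word_counts": {label: dict(c) for label, c in word_counts.items()},
        "vocab": vocab,
    }


def classify(model, text):
    """Return the most probable label, using add-one (Laplace) smoothing."""
    vocab = set(model["vocab"])
    total_docs = sum(model["doc_counts"].values())
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log(
                    (counts.get(word, 0) + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# "Lab" step: train on invented examples (not ASIC's documents).
lab_model = train([
    ("guaranteed returns double your money risk free offer", "scam"),
    ("send funds now to secure guaranteed profit", "scam"),
    ("company annual report lists audited earnings", "non-scam"),
    ("licensed adviser explains risk and fees", "non-scam"),
])

# "Export" step: serialise the model, then load it where documents arrive.
exported = json.dumps(lab_model)
deployed = json.loads(exported)

print(classify(deployed, "guaranteed profit, send your money now"))
```

Keeping the trained model as plain data is one simple way to separate experimentation from the system that processes the spider's output; whether Scamseek exports its classifier this way is not stated in the article.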

Steven Deare

Computerworld