HAI Seminar with Erik Altman

Wed May 07 2025 at 12:00 pm to 01:15 pm UTC-07:00

Gates Computer Science Building Room 119 | Stanford

Publisher/Host: Stanford Institute for Human-Centered Artificial Intelligence (HAI)
Visiting scholars share their research with the HAI community.
About this Event

Synthetic Data Sets: Use Cases for the Financial Industry

IBM Synthetic Data Sets (SDS) were created for use cases in the financial industry. One key focus is fraud and criminal activity, which costs hundreds of billions of dollars per year or more. SDS labels many of these criminal activities, including money laundering, credit card fraud, check fraud, Authorized Push Payment (APP) fraud (scams), and insurance claims fraud. As such, SDS data provides an attractive foundation for training AI detection models.

Unlike much current activity around synthetic data generation, SDS is not built using large language models. Instead, SDS uses an agent-based virtual-world approach. A key advantage of the SDS design is that all labels are correct: all fraud is labelled fraud, and only fraud is labelled fraud. By contrast, much criminal activity goes undetected in the real world, including 95% of money laundering by a UN estimate. Hence, even when real data is available, it is often of poor quality for training detection models or for generating synthetic data.

In practice, access to real data is generally limited to a small number of people at the institution (e.g., a bank) that owns the data. As such, real data provides only a narrow view of activity at a single institution – as opposed to the global view provided by SDS data. The SDS approach also yields a broad set of synthetic personal information. This information is highly realistic despite using no information from real individuals.

Development of effective techniques for SDS has required deep expertise across diverse areas.  It has also required significant manual effort.  How to automate some of these efforts remains an open challenge, as do calibration, scaling, and other areas.


Details:

Time: 12:00 pm - 1:15 pm PT

Location: Gates Computer Science Building, Room 119, 353 Jane Stanford Way, Stanford, CA 94305


Tickets

USD 0.00
