BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20251120T110349EST-82014RgrOk@132.216.98.100
DTSTAMP:20251120T160349Z
DESCRIPTION:Heng Zhang\n\nW.P. Carey Business School\n Arizona State University\n\nLarge Language Models for Market Research: A Data-augmentation Approach\n\nDate: Friday\, November 21\, 2025\n Time: 11:00 AM - 12:00 PM\n Location: Bronfman Building\, Room 045\n\n\nAbstract\n\nLarge Language Models (LLMs) have transformed artificial intelligence by excelling in complex natural language processing tasks. Their ability to generate human-like text has opened new possibilities for market research\, particularly in conjoint analysis\, where understanding consumer preferences is essential but often resource-intensive. Traditional survey-based methods face limitations in scalability and cost\, making LLM-generated data a promising alternative. However\, while LLMs have the potential to simulate real consumer behavior\, recent studies highlight a significant gap between LLM-generated and human data\, with biases introduced when substituting between the two. In this paper\, we address this gap by proposing a novel statistical data augmentation approach that efficiently integrates LLM-generated data with real data in conjoint analysis. This results in statistically robust estimators with consistent and asymptotically normal properties\, in contrast to naive approaches that simply substitute human data with LLM-generated data\, which can exacerbate bias. We further present a finite-sample performance bound on the estimation error. We validate our framework through an empirical study on COVID-19 vaccine preferences\, demonstrating its superior ability to reduce estimation error and save data and costs by 24.9% to 79.8%. In contrast\, naive approaches fail to save data due to the inherent biases in LLM-generated data compared to human data. Another empirical study on sports car choices validates the robustness of our results. Our findings suggest that while LLM-generated data is not a direct substitute for human responses\, it can serve as a valuable complement when used within a robust statistical framework.\n
DTSTART:20251121T160000Z
DTEND:20251121T170000Z
LOCATION:Room 045\, Donald E. Armstrong Building\, CA\, QC\, Montreal\, H3A 3L1\, 3420 rue McTavish
SUMMARY:Management Science Research Centre (MSRC) Seminar: Heng Zhang
URL:/dobson/channels/event/management-science-research-centre-msrc-seminar-heng-zhang-368977
END:VEVENT
END:VCALENDAR