Ethan Mollick wants AI to play the role of a Victorian scientist who believes in the luminiferous ether.


Ethan Mollick proposes using synthetic data to simulate Victorian scientists searching for the luminiferous ether

Summary

Core idea: fill historical gaps with synthetic data, then run counterfactual historical simulations. Ethan Mollick (Wharton School, researching how AI changes work) floated a concept on social media: use AI-generated synthetic data to construct “missing historical records,” allowing models to simulate counterfactual historical scenarios. His example is an AI agent playing a Victorian scientist still in pursuit of the “luminiferous ether,” a technique reminiscent of Neal Stephenson’s weaving of real history into fictional narrative.

The background issue is simple: some historical periods have sparse documentary records. If synthetic data can reasonably fill in these gaps, AI can simulate paths that did not occur but “could have occurred.” For the AI industry, this points to education and the humanities—fields where generative models are still seeking practical applications.

Analysis

In his book “Co-Intelligence,” Mollick positions AI as a collaborative tool. The new concept extends that framing into the modeling of history and the history of science:

  • Fill in the gaps in Victorian-era historical records with synthetic data
  • Allow agents to “autonomously” advance scientific questions of the time under certain constraints
  • The goal is not to replicate historical facts but to restore the cognitive boundaries and reasoning methods of that era
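The constraint layer implied by these bullets can be sketched in code. The class and field names below are my own illustration (nothing of the sort appears in Mollick’s post): a persona spec that pins an agent to a knowledge cutoff, a set of era-appropriate priors, and a blocklist of anachronisms, then renders it as a system prompt.

```python
from dataclasses import dataclass, field

@dataclass
class HistoricalPersona:
    """Constraint spec for an agent role-playing a scientist of a past era.

    All names and fields here are illustrative, not from Mollick's proposal.
    """
    name: str
    year: int                                      # knowledge cutoff
    priors: list = field(default_factory=list)     # beliefs held as true
    forbidden: list = field(default_factory=list)  # anachronistic concepts

    def system_prompt(self) -> str:
        lines = [
            f"You are {self.name}, a scientist writing in {self.year}.",
            f"You know nothing discovered after {self.year}.",
            "You treat the following as established fact: "
            + "; ".join(self.priors) + ".",
            "Never mention or reason with: " + ", ".join(self.forbidden) + ".",
        ]
        return "\n".join(lines)

victorian = HistoricalPersona(
    name="a Victorian physicist",
    year=1885,
    priors=["light propagates through a luminiferous ether"],
    forbidden=["special relativity", "photons", "the Michelson-Morley result"],
)
print(victorian.system_prompt())
```

In practice this prompt would be paired with retrieval over period sources (real plus synthetic), so the agent’s “world knowledge” is bounded by the corpus as well as by the instructions.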

The luminiferous ether is a compelling test case. The mainstream 19th-century hypothesis held that light needed an invisible medium in order to propagate through space. The 1887 Michelson-Morley experiment found no trace of the expected “ether wind”; this null result undermined the hypothesis and helped clear the way for Einstein’s relativity. An AI exploring physics with the pre-1887 prior that “the ether certainly exists” could retrace the chain of scientific reasoning before the paradigm shift, rather than critiquing the era from the vantage of modern answers.
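To see how sharp the 1887 constraint was, the expected signal under the ether hypothesis is easy to compute with the standard textbook formula n = 2Lv²/(λc²): arm length L, presumed ether-wind speed v (roughly Earth’s orbital speed), and wavelength λ. A minimal sketch:

```python
# Expected Michelson-Morley fringe shift under the ether hypothesis,
# using the classic formula n = 2 L v^2 / (lambda c^2).
L = 11.0      # effective arm length in metres (1887 apparatus)
v = 3.0e4     # Earth's orbital speed, m/s
lam = 5.5e-7  # wavelength of the light used, m
c = 3.0e8     # speed of light, m/s

n = 2 * L * v**2 / (lam * c**2)
print(f"expected fringe shift: {n:.2f}")  # prints: expected fringe shift: 0.40
```

The experiment was sensitive to shifts far smaller than 0.4 fringes, yet no significant shift appeared. An agent constrained to pre-1887 priors would have to grapple with exactly this kind of tension.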

Referencing Stephenson is not arbitrary. His “Baroque Cycle” integrates historical figures like Newton and Leibniz into fictional narratives surrounding science, cryptography, and finance. Mollick’s vision is similar: to have AI generate reasonable historical trajectories of events that “could have happened but did not,” and then examine how these trajectories differ from actual history.

The risks are also evident: synthetic data can introduce hallucinations and biases. If the “filled-in” historical context is itself fictional, how does one distinguish useful inference from misleading fabrication? There are as yet no recognized standards for validation and labeling, an obstacle that must be cleared before practical deployment.

Implications

  • Education and museums: Interactive courses or exhibitions could be developed to run counterfactual scenarios within a controlled scope, making learning more engaging
  • Policy think tanks: Build “what-if” sandboxes to examine historical policy crossroads and their costs
  • Industry opportunities: Tools centered around “synthetic data—context modeling—multi-agent simulation” could become the infrastructure for edtech and digital humanities
  • Risks and governance: AI outputs that look authoritative may amplify false memories and information pollution if their sources and confidence levels are not marked. Strict validation processes and clear generated-content labels are needed, and regulators may step in
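The labeling discipline the last bullet calls for could start as a simple provenance record attached to every synthetic document. The schema below is purely illustrative (no such standard exists yet); every field name is my own assumption.

```python
import json

# Hypothetical provenance record for one synthetic "historical" document.
# Field names are illustrative; the article notes no standard exists yet.
record = {
    "doc_id": "letter-0042",
    "origin": "synthetic",          # "archival" | "synthetic" | "hybrid"
    "generator": "llm-simulation",  # how the text was produced
    "grounding_sources": ["catalogue entry it was extrapolated from"],
    "confidence": 0.6,              # curator confidence it is era-plausible
    "reviewed_by_historian": False,
}

def is_presentable(rec: dict) -> bool:
    """Show only archival content, or labeled synthetic content above a floor."""
    return rec["origin"] == "archival" or (
        rec["confidence"] >= 0.5 and rec["generator"] is not None
    )

print(json.dumps(record, indent=2))
print(is_presentable(record))  # prints: True
```

Even a crude gate like this makes the “useful inference vs. misleading fabrication” question auditable: every displayed passage carries its origin and a confidence that a reviewer can challenge.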

Key Takeaways

  • Synthetic data makes counterfactual history feasible at the “material” level, especially in periods and fields with scarce historical records
  • The luminiferous ether case shows that AI can retrace scientific reasoning within the cognitive framework that prevailed before a paradigm shift, helping us understand the decision constraints scientists faced at the time
  • Validation and labeling are the top challenges: defining “useful inferences” versus “misleading fabrications” still requires breakthroughs in methodology and tools

Related Developments

  • Researchers have been conducting counterfactual historical writing experiments using LLMs on platforms like Substack
  • A 2024 arXiv paper (arXiv:2407.13922) explores analyzing visual-model biases using “synthetic counterfactual faces,” indicating the potential of synthetic data in extrapolation and fairness research
  • MIT Technology Review has reported on AI assisting historians in analyzing star catalogs and other historical documents

These trends align with Mollick’s ideas: using AI to expand our capacity to understand the past (and its potential branches).

Further Reading

  • Ethan Mollick: “Co-Intelligence: Living and Working with AI” (Penguin Portfolio, 2024)
  • Neal Stephenson: “The Baroque Cycle”
  • arXiv paper: “Synthetic Counterfactual Faces” (arXiv:2407.13922, 2024)
  • MIT Technology Review: “How AI is helping historians better understand our past” (April 2023)

Bottom-line assessment: this is an early, methodology-driven narrative. It currently suits builders willing to refine data and evaluation pipelines, namely edtech/digital-humanities tool developers and research institutions. There is little here for short-term monetization; long-term bets can be tracked with small positions, but the real advantage will belong to the builders establishing standards and products.
