Building a Multi-Lingual, Multi-Task Model in the Finance Domain

You’re on the right path with multi-lingual, multi-task models in the finance domain. I’ve built Triskel Data, a curated archive of high-value, structured legal and financial datasets, including:

  • CourtListener (legal rulings)
  • SEC filings (fully extracted)
  • Federal Register (regulatory history)
  • AI patent datasets

All of them are cleaned and tokenization-ready in .jsonl format, not raw scrapes.
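For anyone who hasn't worked with .jsonl before, it is one JSON object per line. Here is a minimal sketch of reading that kind of file in Python. The field names (`source`, `text`) and the sample records are made up for illustration; they are not the actual Triskel Data schema.

```python
import io
import json


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Hypothetical sample data standing in for a real .jsonl file.
# The "source" and "text" keys are assumptions, not a documented schema.
sample = io.StringIO(
    '{"source": "sec_filing", "text": "Annual report text"}\n'
    "\n"
    '{"source": "court_ruling", "text": "Opinion text"}\n'
)

records = list(read_jsonl(sample))
texts = [record["text"] for record in records]
```

With a real file you would pass an open file handle (opened with `encoding="utf-8"`) to `read_jsonl` instead of the in-memory sample, then feed `texts` to whatever tokenizer you use.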

A Developer Tier is available with limited access for serious users. While not free, it’s accessible enough to get started without the typical scraping or cleanup burden.
