Responsible AI-ready data

Source-linked public data before automation.

This prototype treats AI readiness as a governance and evidence problem first: public records, source URLs, staff review, logs, and plain-language explanations before any assisted metadata or search feature reaches residents.

Boundaries for this scope

Public metadata and public records only.
Human review before assisted tags, summaries, translations, or data stories are published.
Every answer and generated draft keeps a source URL.
No nonpublic data, no AI applied to row-level sensitive data, and no claims of AI decision-making.
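
The boundaries above can be read as a publish gate. The following is a minimal sketch, not the prototype's actual code: the `Draft` record, its field names, and the `publish_blockers` function are hypothetical names introduced here for illustration. It assumes each assisted draft carries its source URL, a public-record flag, and the name of the staff reviewer.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Draft:
    """Hypothetical assisted draft: a tag, summary, translation, or data story."""
    kind: str
    text: str
    source_url: str = ""            # every draft is expected to keep one
    is_public_record: bool = False  # assumption: set by staff, not by the model
    reviewed_by: str = ""           # empty means no human review yet


def publish_blockers(draft: Draft) -> list[str]:
    """Return the reasons a draft may not be published (empty list = may publish)."""
    blockers = []
    parsed = urlparse(draft.source_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        blockers.append("missing or invalid source URL")
    if not draft.is_public_record:
        blockers.append("source is not marked as a public record")
    if not draft.reviewed_by.strip():
        blockers.append("no human review recorded")
    return blockers


# Illustrative draft with a placeholder URL; it has a source but no reviewer.
draft = Draft(
    kind="summary",
    text="Example summary text.",
    source_url="https://example.org/dataset/1",
    is_public_record=True,
)
print(publish_blockers(draft))  # ['no human review recorded']
```

Returning the list of reasons, rather than a single yes or no, keeps the log entry for a blocked draft readable in plain language.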

City source alignment

The portal can connect San Jose's open-data catalog, AI Inventory page, GovAI Coalition resources, and data-governance page without mixing public portal work with internal systems.

City AI Inventory | GovAI Coalition | Data Strategy and Governance

CKAN evidence

Public records that already support multilingual and AI-readiness work.

Gov AI Coalition

Agencies Translation Pairs

Agencies can upload their translation pairs in this dataset. Languages available: Amharic, Arabic, Armenian, Burmese, Chinese_simplified, Chinese_traditional, Farsi/Dari, Hmong, Korean, Russian, Somali, Spanish, Tagalog, Urdu, Vietnamese. File name format: [agencyName]_[sourceLanguage]_[targetLanguage] (all Lower ca...

Gov AI Coalition | 24 datastore resources | 29 total resources

Open Stoa page

Information Technology

SJ311 Language Translation AutoML Training Pairs

This dataset contains training pairs that were used for custom AutoML models for SJ311 Language Translations for dynamic content.

Information Technology | 2 datastore resources | 2 total resources

Open Stoa page
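
The datastore and total resource counts shown on each record can be derived from the catalog's own metadata. The sketch below assumes the records are CKAN package dictionaries of the kind returned by the `package_show` action, where each resource entry carries a `datastore_active` flag and the package has an `organization` with a `title`. The `resource_counts` and `card_line` helpers and the sample package are hypothetical and exist only for illustration; no network call is made.

```python
def resource_counts(package: dict) -> tuple[int, int]:
    """Return (datastore_resources, total_resources) for a CKAN package dict."""
    resources = package.get("resources", [])
    in_datastore = sum(1 for r in resources if r.get("datastore_active") is True)
    return in_datastore, len(resources)


def card_line(package: dict) -> str:
    """Build the one-line count summary for a record card."""
    org = (package.get("organization") or {}).get("title", "Unknown organization")
    active, total = resource_counts(package)
    return f"{org}: {active} datastore resources, {total} total resources"


# Hypothetical package, shaped like a CKAN package_show result.
sample_package = {
    "title": "Example Dataset",
    "organization": {"title": "Example Department"},
    "resources": [
        {"name": "pairs_a.csv", "datastore_active": True},
        {"name": "pairs_b.csv", "datastore_active": True},
        {"name": "readme.pdf", "datastore_active": False},
    ],
}

print(card_line(sample_package))
# Example Department: 2 datastore resources, 3 total resources
```

Computing the counts from the public metadata, rather than typing them by hand, keeps each card traceable to its source record.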