Get Transcription
Returns the current status and (once complete) the transcript for a docId returned by POST /api/createTranscription. Use this as an alternative to webhook delivery when hosting a public webhook receiver isn’t an option (for example, when the consumer is behind Cloudflare with IP-whitelisting requirements that the Xosum Cloud Functions infrastructure can’t satisfy).
Recommended polling cadence: every 5–10 seconds per in-flight job. Most single_voice jobs finish within seconds; phone_call jobs may take a few minutes.
Rate limit: 60 requests per minute per API key, fixed wall-clock-minute window. A 429 response includes a Retry-After header (seconds) indicating when the next window opens — sleep at least that long before retrying.
Polling pattern
Most jobs finish in seconds (single_voice) to a few minutes (phone_call). Poll every 5–10 seconds until status becomes transcribed or failed, then stop.
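The polling pattern above can be sketched as a small loop. This is a minimal illustration, not an official client: fetch_status is a hypothetical callable that you supply (it would perform the authenticated GET for a docId and return the parsed JSON body), and the 600-second timeout is an assumed safety cap that this reference does not define.

```python
import time
from typing import Any, Callable, Dict

# Terminal states named in this reference; polling stops on either one.
TERMINAL_STATUSES = frozenset({"transcribed", "failed"})


def poll_until_terminal(
    fetch_status: Callable[[str], Dict[str, Any]],  # hypothetical: wraps the HTTP GET
    doc_id: str,
    interval_s: float = 5.0,    # recommended cadence is 5-10 seconds
    timeout_s: float = 600.0,   # assumed cap, not part of the API
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status(doc_id) every interval_s seconds until the job is terminal."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(doc_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job  # transcribed or failed: stop polling
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(
                f"job {doc_id} still {job.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Passing sleep as a parameter keeps the loop testable without real delays; in production the default time.sleep is enough.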
Rate limit
- 60 requests per minute per API key, fixed wall-clock-minute window — the counter resets at the top of each minute, not as a rolling window.
- On a 429 response, the Retry-After header tells you how many seconds until the next window opens. Sleep for at least that long before retrying.
- The recommended 5–10 second cadence per in-flight job stays well under the limit for a handful of parallel jobs.
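To make the rate-limit arithmetic concrete, the sketch below shows one way to read Retry-After from a 429 response and to estimate how many jobs can be polled in parallel at a given cadence. Both helper names are hypothetical, the 1-second fallback for a missing or unparsable header is an assumption, and the parallel-job figure is only approximate because poll times may not align with the wall-clock window.

```python
import math
from typing import Mapping

RATE_LIMIT_PER_MINUTE = 60  # per API key, as stated in this reference


def retry_after_seconds(headers: Mapping[str, str], default_s: float = 1.0) -> float:
    """Seconds to sleep after a 429, read from the Retry-After header.

    Header names are matched case-insensitively. default_s is an assumed
    fallback for a missing or non-numeric value.
    """
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(float(value), 0.0)
            except ValueError:
                return default_s
    return default_s


def max_parallel_jobs(poll_interval_s: float) -> int:
    """Approximate number of jobs that can be polled at this cadence within the limit."""
    # Each job issues about 60 / poll_interval_s requests per minute.
    return math.floor(RATE_LIMIT_PER_MINUTE * poll_interval_s / 60)
```

With these numbers, polling every 5 seconds leaves room for about 5 parallel jobs per API key, and polling every 10 seconds for about 10.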
Response field availability
The base fields (docId, status, createdAt, type, duration, metadata) are always returned. The fields below appear only once status reaches a terminal value:
| When status is | Additional fields |
|---|---|
| transcribed | transcription, title, resolution, languageCode |
| failed | error |
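Because the additional fields exist only in terminal states, client code should branch on status before reading them. A minimal sketch, assuming the response has already been parsed into a dict; TranscriptionFailed and transcript_or_none are hypothetical names, not part of the API.

```python
from typing import Any, Dict, Optional


class TranscriptionFailed(Exception):
    """Raised when a job reaches the failed state (hypothetical helper type)."""


def transcript_or_none(job: Dict[str, Any]) -> Optional[str]:
    """Return the transcript if the job is done, None if it is still in progress."""
    status = job["status"]  # base field, always returned
    if status == "transcribed":
        return job["transcription"]  # present only in this state
    if status == "failed":
        # error is present only when status is failed
        raise TranscriptionFailed(job.get("error", "unknown error"))
    return None  # uploading, processing, or converting
```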
The transcription field contains plain text when type is single_voice, and diarized text with inline Խոսնակ 1: / Խոսնակ 2: speaker labels (Armenian for "Speaker 1" / "Speaker 2") when type is phone_call.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Query Parameters
docId: The document ID returned by POST /api/createTranscription.
Response
Current status and (once complete) the transcript.
Polling response for a transcription job. The base fields are always returned; the additional fields below are populated only once status reaches a terminal value (transcribed or failed).
- docId: The same ID returned by POST /api/createTranscription.
- status: Current job state. One of uploading, processing, converting, transcribed, failed. Terminal states are transcribed and failed; clients should stop polling once one is reached.
- createdAt: ISO-8601 timestamp of when the job was created.
- type: Echoes the type passed when the job was created. One of single_voice, phone_call.
- duration: Audio duration in seconds. Populated after the recording is processed.
- metadata: The same metadata object passed to POST /api/createTranscription.
- transcription: Present when status is transcribed. For type: phone_call, the text is diarized with inline Խոսնակ 1: / Խոսնակ 2: speaker labels. For type: single_voice, the text is plain with no labels.
- title: AI-generated short title. Present when status is transcribed.
- resolution: AI-generated one-paragraph summary. Present when status is transcribed.
- languageCode: BCP-47 language code of the detected language, e.g. hy-AM. Present when status is transcribed.
- error: Human-readable error message. Present when status is failed.
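For phone_call jobs, you may want to split the diarized transcription into per-speaker turns. The sketch below assumes each turn begins with a label of the form "Խոսնակ N:" and runs until the next label. This reference does not specify the exact layout (for example, whether turns are separated by newlines), so treat the pattern as an assumption to verify against real output. split_speaker_turns is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Assumed label shape: the word "Խոսնակ", optional whitespace, a number, a colon.
SPEAKER_LABEL = re.compile(r"Խոսնակ\s*(\d+):")


def split_speaker_turns(transcription: str) -> List[Tuple[int, str]]:
    """Split diarized text into (speaker_number, text) pairs, in order."""
    matches = list(SPEAKER_LABEL.finditer(transcription))
    turns: List[Tuple[int, str]] = []
    for i, match in enumerate(matches):
        # A turn runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(transcription)
        turns.append((int(match.group(1)), transcription[match.end():end].strip()))
    return turns
```

A single_voice transcript carries no labels, so the function returns an empty list for it.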
