GET /api/getTranscription
Poll the status and result of a transcription job
curl --request GET \
  --url 'https://app.xosum.am/api/getTranscription?docId=<docId>' \
  --header 'Authorization: Bearer <token>'
{
  "docId": "<string>",
  "status": "transcribed",
  "createdAt": "2023-11-07T05:31:56Z",
  "type": "single_voice",
  "duration": 123,
  "metadata": {},
  "transcription": "<string>",
  "title": "<string>",
  "resolution": "<string>",
  "languageCode": "<string>",
  "error": "<string>"
}
Poll for the status and result of a transcription job by its docId. This is the alternative to receiving the result via your webhook — useful when you can’t host a publicly reachable receiver (for example, when your network requires IP whitelisting that the Xosum Cloud Functions infrastructure can’t provide).

Polling pattern

Most jobs finish in seconds (single_voice) to a few minutes (phone_call). Poll every 5–10 seconds until status becomes transcribed or failed, then stop.
curl -H "Authorization: Bearer $XOSUM_API_KEY" \
  "https://app.xosum.am/api/getTranscription?docId=$DOC_ID"
Python
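import os
import time

import requests

API_KEY = os.environ["XOSUM_API_KEY"]
doc_id = "XYZ123"  # docId returned by POST /api/createTranscription

while True:
    r = requests.get(
        "https://app.xosum.am/api/getTranscription",
        params={"docId": doc_id},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,  # avoid hanging forever on a stalled connection
    )
    if r.status_code == 429:
        # Rate limited: wait until the next window opens, then retry.
        time.sleep(int(r.headers.get("Retry-After", "5")))
        continue
    r.raise_for_status()
    body = r.json()
    # Stop once the job reaches a terminal state.
    if body["status"] in ("transcribed", "failed"):
        print(body)
        break
    time.sleep(7)  # within the recommended 5-10 second cadence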
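
The loop above polls indefinitely if a job never reaches a terminal state. As a minimal sketch, assuming you want an overall deadline, the polling logic could be factored into a helper like the one below. The name `poll_until_done`, the `fetch` callable, and the 600-second default cap are illustrative assumptions, not part of the Xosum API.

```python
import time
from typing import Any, Callable, Dict

# Terminal status values named on this page.
TERMINAL_STATUSES = {"transcribed", "failed"}


def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],
    interval_s: float = 7.0,
    max_wait_s: float = 600.0,  # assumed cap; tune to your longest expected job
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch() until the job is terminal or the deadline would be exceeded."""
    deadline = clock() + max_wait_s
    while True:
        body = fetch()
        if body.get("status") in TERMINAL_STATUSES:
            return body
        # Give up if the next poll would land after the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(
                f"job still {body.get('status')!r} after {max_wait_s}s"
            )
        sleep(interval_s)
```

Here `fetch` would wrap the `requests.get(...)` call and return the parsed JSON body; passing `sleep` and `clock` as parameters keeps the helper testable without real waiting.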

Rate limit

  • 60 requests per minute per API key, fixed wall-clock-minute window — the counter resets at the top of each minute, not as a rolling window.
  • On a 429 response, the Retry-After header tells you how many seconds until the next window opens. Sleep for at least that long before retrying.
  • The recommended 5–10 second cadence per in-flight job stays well under the limit for a handful of parallel jobs.
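
To make "a handful of parallel jobs" concrete, the arithmetic can be sketched as follows. The helper names are hypothetical, and `seconds_until_next_window` assumes the server's minute boundaries line up with Unix-epoch minutes, which this page does not confirm; on an actual 429, the Retry-After header remains the authoritative value.

```python
import math

LIMIT_PER_MINUTE = 60  # rate limit stated above
WINDOW_S = 60          # fixed wall-clock-minute window


def max_parallel_jobs(interval_s: float, limit: int = LIMIT_PER_MINUTE) -> int:
    """Largest number of jobs that can each be polled every interval_s seconds
    without exceeding `limit` requests in any one window."""
    # Worst case: a job's first poll lands exactly at the start of the window.
    polls_per_job = math.ceil(WINDOW_S / interval_s)
    return limit // polls_per_job


def seconds_until_next_window(epoch_s: float) -> float:
    """Seconds until the next wall-clock minute starts (assumed reset point)."""
    return WINDOW_S - (epoch_s % WINDOW_S)
```

Under these assumptions, a 5-second cadence leaves room for 5 concurrent jobs per API key and a 10-second cadence for 10; beyond that, lengthen the interval or expect 429 responses.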

Response field availability

The base fields (docId, status, createdAt, type, duration, metadata) are always returned. The fields below appear only once status reaches a terminal value:
When status is    Additional fields
transcribed       transcription, title, resolution, languageCode
failed            error
The transcription field contains plain text when type is single_voice and diarized text with inline Խոսնակ 1: / Խոսնակ 2: speaker labels when type is phone_call.
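
If you need per-speaker turns from a phone_call transcript, one possible approach is to split on the speaker labels (Խոսնակ is Armenian for "speaker"). This is only a sketch: it assumes every label has exactly the form `Խոսնակ <number>:` shown above, and the function name `split_speakers` is hypothetical.

```python
import re
from typing import List, Tuple

# Assumption: labels look exactly like "Խոսնակ <number>:" as described above.
SPEAKER_LABEL = re.compile(r"(Խոսնակ \d+):")


def split_speakers(transcription: str) -> List[Tuple[str, str]]:
    """Split diarized text into (speaker label, utterance) pairs."""
    # With a capturing group, split() yields:
    # [text before first label, label, utterance, label, utterance, ...]
    parts = SPEAKER_LABEL.split(transcription)
    return [
        (label, text.strip())
        for label, text in zip(parts[1::2], parts[2::2])
    ]
```

For single_voice transcripts, which carry no labels, this returns an empty list, so callers should fall back to the raw transcription string in that case.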

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Query Parameters

docId
string
required

The document ID returned by POST /api/createTranscription.

Response

Current status and (once complete) the transcript.

Polling response for a transcription job. The base fields are always returned; the additional fields below are populated only once status reaches a terminal value (transcribed or failed).
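
Because the additional fields depend on status, client code typically branches on it before reading them. A minimal sketch, assuming the parsed JSON body is a plain dict (the helper name `extract_result` is illustrative):

```python
from typing import Any, Dict, Optional

TRANSCRIBED_FIELDS = ("transcription", "title", "resolution", "languageCode")


def extract_result(body: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the terminal-state fields, or None while the job is still running."""
    status = body["status"]
    if status == "transcribed":
        return {name: body.get(name) for name in TRANSCRIBED_FIELDS}
    if status == "failed":
        return {"error": body.get("error")}
    # uploading, processing, converting: nothing to read yet
    return None
```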

docId
string
required

The same ID returned by POST /api/createTranscription.

status
enum<string>
required

Current job state. Terminal states are transcribed and failed; clients should stop polling once one is reached.

Available options: uploading, processing, converting, transcribed, failed

createdAt
string<date-time> | null

ISO-8601 timestamp of when the job was created.
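
Since the field is nullable and the example value ends in `Z`, parsing it in Python takes a little care. A small sketch (the helper name `parse_created_at` is an assumption for illustration):

```python
from datetime import datetime
from typing import Optional


def parse_created_at(value: Optional[str]) -> Optional[datetime]:
    """Parse the createdAt string into a timezone-aware datetime, or None."""
    if value is None:
        return None
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```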

type
enum<string> | null

Echoes the type passed when the job was created.

Available options: single_voice, phone_call

duration
number | null

Audio duration in seconds. Populated after the recording is processed.

metadata
object

The same metadata object passed to POST /api/createTranscription.

transcription
string | null

Present when status is transcribed. For type: phone_call, the text is diarized with inline Խոսնակ 1: / Խոսնակ 2: speaker labels. For type: single_voice, the text is plain with no labels.

title
string | null

AI-generated short title. Present when status is transcribed.

resolution
string | null

AI-generated one-paragraph summary. Present when status is transcribed.

languageCode
string | null

BCP-47 language code of the detected language, e.g. hy-AM. Present when status is transcribed.

error
string | null

Human-readable error message. Present when status is failed.