Use the CLI when you want to process files immediately. Use the SDK when you want to put parsing inside your own program. Two paths, one goal. If you want the positioning summary first, return to the SoMark CLI & SDK overview.

1. Installation

SoMark has Python and JavaScript implementations. Install the package for the language you use. Both packages provide the SDK and the somark CLI command. As a baseline, Python requires 3.10+ and Node.js requires 18+.
# Python: install SDK and CLI
pip install somark

# JavaScript: install SDK and CLI
npm install somark-js
If the somark command cannot be found, the package is usually installed correctly but the command directory is not in your PATH. Python users can try python -m somark.cli.main --help; Node users can try npx somark-js --help first. Once that runs, you can fix your global command path so that somark can be called directly.
If you install both the Python and JS versions, the CLI uses whichever one was installed first. The later installation skips CLI setup when it sees that the command already exists. To see which implementation is active, run somark --help; the footer shows [PY] or [JS].
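
The PATH fallback described above can be sketched in Python. This is a minimal illustration, not part of SoMark: somark_command is a hypothetical helper name, and the only SoMark-specific detail it relies on is the python -m somark.cli.main entry mentioned in the text.

```python
import shutil
import sys
from typing import List


def somark_command(name: str = "somark") -> List[str]:
    """Return an argv prefix that should launch the SoMark CLI.

    Hypothetical helper: prefers the command on PATH and otherwise falls
    back to running the Python CLI module with the current interpreter.
    """
    found = shutil.which(name)
    if found:
        # The command directory is on PATH, so call the command directly.
        return [found]
    # Fallback named in the text: python -m somark.cli.main
    return [sys.executable, "-m", "somark.cli.main"]


if __name__ == "__main__":
    # Print the command line that would show the CLI help text.
    print(somark_command() + ["--help"])
```

Passing the returned list to subprocess.run lets a script keep working on machines where the command directory has not been added to PATH yet.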

2. Authentication and configuration

Remote parsing and usage queries require an API key. Local PDF processing and SoMarkDown preview do not. You can save configuration through the CLI, or pass parameters directly when initializing the SDK.
If your project already uses .env, put SOMARK_API_KEY=sk-your-api-key there and load it with python-dotenv, dotenv, or your own startup script. SoMark reads the environment variable itself; it does not decide how your .env file is loaded. Do not commit .env to Git.
# Interactive login writes the API key to local config
somark login

# You can also set, read, and inspect config directly
somark config set api_key sk-your-api-key
somark config get api_key
somark config list

# For temporary use, environment variables are lighter
export SOMARK_API_KEY=sk-your-api-key
Entry points: somark login, somark config ..., or configuration through command flags and environment variables.

Command flags / environment variables
| Capability | Flag / field | Description |
| --- | --- | --- |
| API key | --api-key, SOMARK_API_KEY | Used for remote parsing and usage queries. |
| Base URL | --base-url, SOMARK_BASE_URL | Default: https://somark.tech/api/v1. |
| Timeout | --timeout, SOMARK_TIMEOUT | Measured in seconds. |
| Retry count | SOMARK_MAX_RETRIES | Default: 2. |
| Parse concurrency | SOMARK_PARSE_MAX_CONCURRENCY | Maximum concurrency for batch parsing. The default must remain 1. |
| Hide warnings | --no-warnings | Hide SoMark warning output; does not change command results, exit codes, or the warnings field on SDK response objects. |

Leave SOMARK_PARSE_MAX_CONCURRENCY at its default of 1, which is also the official parsing concurrency for all users. Only users who have been explicitly approved for higher concurrency should set it to 2 or above. When the CLI detects a value of 2 or above, it reports this through the warning channel. This is a local warning, not an API warning.
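
To make the concurrency rule concrete, here is a small model of such a local check. It is only a sketch under stated assumptions: LocalConcurrencyWarning and parse_concurrency are hypothetical names, and the real CLI or SDK may word and raise its warning differently.

```python
import os
import warnings
from typing import Mapping


class LocalConcurrencyWarning(UserWarning):
    """Hypothetical stand-in for a local (non-API) warning category."""


def parse_concurrency(environ: Mapping[str, str] = os.environ) -> int:
    """Read SOMARK_PARSE_MAX_CONCURRENCY, defaulting to 1.

    Warns locally, without any API call, when the value is 2 or above.
    """
    value = int(environ.get("SOMARK_PARSE_MAX_CONCURRENCY", "1"))
    if value < 1:
        raise ValueError("concurrency must be at least 1")
    if value >= 2:
        warnings.warn(
            f"parse concurrency is {value}; the official default quota is 1",
            LocalConcurrencyWarning,
            stacklevel=2,
        )
    return value
```

The check runs before any request is sent, which is why the text classifies it as a local warning rather than an API warning.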
Config file fields
| Capability | Flag / field | Description |
| --- | --- | --- |
| API key | api_key | Stored in ~/.somark/config.toml. |
| Base URL | base_url | Stored in ~/.somark/config.toml. |
| Timeout | timeout | Stored in ~/.somark/config.toml. |
| Retry count | max_retries | Stored in ~/.somark/config.toml. |

Priority: command flags > environment variables > config file > defaults.
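
The priority order can be read as a simple first-match lookup. The sketch below models it with plain dictionaries; resolve_setting is a hypothetical helper, not SoMark's own resolver, and it assumes each config key maps to an environment variable named SOMARK_ plus the upper-cased key, which matches the names listed above.

```python
from typing import Mapping, Optional

# Defaults documented above; no default timeout is stated, so none is listed.
DEFAULTS = {"base_url": "https://somark.tech/api/v1", "max_retries": "2"}


def resolve_setting(
    key: str,
    flags: Mapping[str, str],
    environ: Mapping[str, str],
    config_file: Mapping[str, str],
    defaults: Mapping[str, str] = DEFAULTS,
) -> Optional[str]:
    """Return the first value found: flags > environment > config file > defaults."""
    env_name = "SOMARK_" + key.upper()  # e.g. api_key -> SOMARK_API_KEY
    candidates = (
        flags.get(key),
        environ.get(env_name),
        config_file.get(key),
        defaults.get(key),
    )
    for value in candidates:
        if value is not None:
            return value
    return None


if __name__ == "__main__":
    # The environment variable wins over the config file; the flag would win over both.
    print(resolve_setting("api_key", {}, {"SOMARK_API_KEY": "sk-env"}, {"api_key": "sk-file"}))
```

Keeping the lookup in one place makes it easy to explain why a value set in ~/.somark/config.toml appears to be ignored: a flag or environment variable is usually shadowing it.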

3. Warnings

SoMark warnings fall into two categories: API warnings and local warnings. API warnings come from the top-level response field warnings: List[str], at the same level as code and message. Local warnings come from SDK or CLI runtime checks, such as a batch parsing concurrency setting above the default quota. The CLI shows SoMark warnings by default. Use the global --no-warnings option when you need quiet output:
CLI
somark --no-warnings parse ./document.pdf
--no-warnings only hides warning display. It does not change command results, exit codes, or remove the warnings field from SDK response objects.

The SDK shows warnings through the native warning channel of each language by default, and also keeps warning strings in the response object’s warnings field. An empty array means there are no warnings.
import warnings
from somark import SoMark, SoMarkWarning

# If you do not want Python programs to display SoMark warnings, filter them with warnings
warnings.filterwarnings("ignore", category=SoMarkWarning)

client = SoMark(api_key="sk-your-api-key")
response = client.parser.parse(file="./document.pdf")

print(response.warnings)  # list[str]
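
If you would rather collect warnings than silence them, Python's standard warnings module can record them. The sketch below runs without the SDK installed, so it uses stand-ins: DemoWarning takes the place of SoMarkWarning, and noisy_call takes the place of a real SDK call; both names are hypothetical.

```python
import warnings
from typing import Callable, List, Tuple


class DemoWarning(UserWarning):
    """Stand-in category; with the SDK installed you would use SoMarkWarning."""


def noisy_call() -> str:
    """Placeholder for an SDK call that emits a warning while it runs."""
    warnings.warn("concurrency is above the default quota", DemoWarning)
    return "ok"


def call_and_collect(func: Callable[[], str]) -> Tuple[str, List[str]]:
    """Run func and return its result plus the warning messages it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = func()
    messages = [
        str(item.message) for item in caught if issubclass(item.category, DemoWarning)
    ]
    return result, messages


if __name__ == "__main__":
    result, messages = call_and_collect(noisy_call)
    print(result, messages)
```

Recording is useful in services that forward warnings to a structured log instead of printing them to stderr.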

4. Parsing

Parsing is the main task for SoMark SDK + CLI. You give SoMark a file, and it returns Markdown, JSON, or a ZIP download URL. The default output format is md, which corresponds to the API’s markdown output. There are two usage patterns: sync parsing and async parsing.

Sync parsing is for “I need the result now” scenarios. The CLI or SDK sends the file to /parse/sync and waits for the server to return the result. It takes little code, has low mental overhead, and works well for single files, scripts, debugging, and small to medium documents. The tradeoff is direct: the larger the file, the longer you wait.

Async parsing is for large files, batch processing, and background tasks. You send the file to /parse/async, immediately get a task_id, then query /parse/async_check with that task_id for progress. The CLI --wait option and SDK task.wait() helper are polling wrappers. This is a better fit for queues, scheduled jobs, and server-side flows.

Sync parsing flow: One request returns the result directly. Best for scripts, debugging, and small to medium files.
Async parsing flow: Submit the task first, then query status by task_id. Best for large files, batch processing, and background queues.

4.1 Sync parsing

Sync parsing is the easiest way to prove the flow works: provide one file, wait for the result, and save it locally. Start with md, check the content quality, then add json, zip, or page feature options as needed.
# One format without --out: writes to stdout, useful for pipes and scripts
somark parse ./document.pdf --formats md

# Quiet mode: hide SoMark warning output
somark --no-warnings parse ./document.pdf --formats md

# One format to a specific file; the parent directory must already exist
somark parse ./document.pdf --formats md --out ./document.md

# Multiple formats must go to an existing directory; creates parsed/document.md and parsed/document.json
mkdir -p ./parsed
somark parse ./document.pdf --formats md,json --out ./parsed/

# Separate multiple files with spaces; the batch output directory must also exist first
mkdir -p ./parsed
somark parse ./a.pdf ./b.pdf --out ./parsed/

# You can also read from a list file, one file path per line
mkdir -p ./parsed
somark parse --file-list ./files.txt --out ./parsed/

# Output Markdown with title-level recognition and HTML tables enabled
somark parse ./paper.pdf --formats md --title-levels --table-fmt html
Entry point: somark parse [files...].
| Capability | Flag / field | Description |
| --- | --- | --- |
| File | [files...] | One or more file paths to parse. Separate multiple files with spaces. |
| File list | --file-list | Read a text file where each non-empty line is a file path; relative paths resolve from the current working directory. |
| Output formats | --formats | Supports md/markdown, json, and zip. |
| Output target | --out | Output file path, or an existing directory. The CLI does not create directories automatically. |
| Image format | --image-fmt | url, base64, or none. |
| Formula format | --formula-fmt | latex, mathml, or ascii. |
| Table format | --table-fmt | markdown, html, or image. |
| Chemical structure format | --cs-fmt | Currently supports image. |
| Title levels | --title-levels | Recognize heading levels. |
| Text across pages | --cross-page-text | Merge text across pages. |
| Tables across pages | --cross-page-table | Merge tables across pages. |
| Inline images | --no-inline-image | Disable inline images. |
| Images in tables | --no-table-image | Disable images in tables. |
| Image understanding | --no-image-understanding | Disable image understanding. |
| Header and footer | --keep-header-footer | Keep headers and footers. |

Return handling

--out only means the output target. It does not mean “create the directory for me” or “treat this filename as a template”. This makes script behavior easier to predict.

| Scenario | Behavior |
| --- | --- |
| Single file, single format, no --out | Writes to stdout. |
| Multiple formats, no --out | Errors; multiple formats require --out pointing to an existing directory. |
| --out is an existing directory | Generates .md, .json, or .zip by input file stem. |
| --out is a file path | Allowed only for a single format; parent directory must already exist. |
| --out is a nonexistent directory | Errors; create the directory yourself first. |
| ZIP output | The response contains a download URL; saving as .zip downloads the content from that URL and writes it to file. |

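
The --out rules above can be restated as a small decision function, which may help when writing scripts around the CLI. This is a model of the documented behavior for a single input file, not the CLI's actual code: plan_outputs is a hypothetical name, "-" is used here to stand for stdout, and treating a trailing path separator as "a directory was intended" is an assumption.

```python
import os
from pathlib import Path
from typing import List, Optional

EXTENSIONS = {"md": ".md", "markdown": ".md", "json": ".json", "zip": ".zip"}
STDOUT = "-"  # placeholder meaning "write to stdout"


def plan_outputs(input_file: str, formats: List[str], out: Optional[str]) -> List[str]:
    """Return the output targets for one input file, or raise ValueError."""
    if out is None:
        if len(formats) > 1:
            raise ValueError("multiple formats require --out pointing to an existing directory")
        return [STDOUT]
    target = Path(out)
    if target.is_dir():
        # Existing directory: one file per format, named after the input stem.
        stem = Path(input_file).stem
        return [str(target / (stem + EXTENSIONS[fmt])) for fmt in formats]
    # Assumption: a trailing separator signals that a directory was intended.
    if out.endswith(("/", os.sep)):
        raise ValueError(f"directory does not exist: {out}")
    if len(formats) > 1:
        raise ValueError("a file path is allowed only for a single format")
    if not target.parent.is_dir():
        raise ValueError(f"parent directory does not exist: {target.parent}")
    return [str(target)]
```

A wrapper script can call a function like this before invoking the CLI, so a missing directory is reported before any file is uploaded.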
Multi-file execution
| Capability | Flag / field | Description |
| --- | --- | --- |
| Concurrency limit | Environment config | Default concurrency is 1; see section 2, “Authentication and configuration”. |
| Concurrency warning | Warning channel | When concurrency is set to 2 or above, a local warning reminds you that the official default quota is 1. |
| Progress and results | CLI output | Shows total file count, current progress, each file’s existence, status, duration, and output location. |
| Missing file | missing status | The CLI does not send missing files, continues with other files, and exits with code 1. |
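
The file-list and missing-file rules can be modeled in a few lines of Python. This is a sketch for pre-checking a batch in your own scripts, not the CLI's implementation: read_file_list and split_batch are hypothetical names, and returning exit code 0 when nothing is missing assumes every present file also parses successfully.

```python
from pathlib import Path
from typing import List, Tuple


def read_file_list(list_path: str) -> List[Path]:
    """Read a list file: each non-empty line is one file path.

    Relative paths are kept as written, so they resolve from the
    current working directory when they are opened.
    """
    lines = Path(list_path).read_text(encoding="utf-8").splitlines()
    return [Path(line.strip()) for line in lines if line.strip()]


def split_batch(paths: List[Path]) -> Tuple[List[Path], List[Path], int]:
    """Return (present, missing, exit_code) for a batch of input paths."""
    present = [p for p in paths if p.is_file()]
    missing = [p for p in paths if not p.is_file()]
    # Missing files are skipped, the rest continue, and the run exits with 1.
    exit_code = 1 if missing else 0
    return present, missing, exit_code
```

Running a check like this first lets a scheduled job report all missing inputs at once instead of discovering them one by one in the CLI output.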

4.2 Async parsing

Async parsing is best for large files and batch processing. Submit first, get a task_id, then poll until success or failure. A 3 to 5 second polling interval is recommended; avoid polling too frequently, because the server needs time to work.
# 1. Submit a task and only get task_id; do not wait for the result
somark parse ./large.pdf --async --formats md,json

# Submit multiple files at once; each file gets its own task_id
somark parse ./a.pdf ./b.pdf --async --formats md

# You can also read from a list file
somark parse --file-list ./files.txt --async --formats md

# 2. Wait for an existing task to finish and save the result locally
somark parse --task-id task_xxx --wait --out ./large.md

# 3. Query status once without blocking
somark parse --task-id task_xxx
Entry points: somark parse [files...] --async to submit tasks; somark parse --task-id task_xxx to query a task.

Submit task: somark parse [files...] --async
| Capability | Flag / field | Description |
| --- | --- | --- |
| File | [files...] | Required when submitting a task; separate multiple files with spaces. |
| File list | --file-list | Read a text file where each non-empty line is a file path; relative paths resolve from the current working directory. |
| Output formats | --formats | Same as sync parsing. |

When submitting multiple files asynchronously, each file receives its own task_id. The CLI shows each file’s existence, submission status, duration, and task ID.

Query / wait task: somark parse --task-id task_xxx
| Capability | Flag / field | Description |
| --- | --- | --- |
| Task ID | --task-id | Query an existing task. |
| Wait switch | --wait | Block until completion. |
| Output target | --out | Save the result after completion; rules are the same as sync parsing. |
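
The submit-then-poll pattern behind --wait and task.wait() can be sketched as a generic loop. This is an illustration only: wait_for_task and make_fake_check are hypothetical helpers, the status dictionary with a "state" key and the state names "success" and "failed" are assumptions, and the real /parse/async_check response may be shaped differently.

```python
import time
from typing import Callable, Dict, Iterable

# Hypothetical terminal states; the real status values may be named differently.
TERMINAL_STATES = {"success", "failed"}


def wait_for_task(
    check: Callable[[str], Dict[str, str]],
    task_id: str,
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Poll check(task_id) until it reports a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = check(task_id)
        if status["state"] in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} seconds")
        sleep(interval)  # a 3 to 5 second interval avoids polling too frequently


def make_fake_check(states: Iterable[str]) -> Callable[[str], Dict[str, str]]:
    """Build a stand-in for the status query that replays a fixed list of states."""
    remaining = iter(states)
    return lambda task_id: {"task_id": task_id, "state": next(remaining)}


if __name__ == "__main__":
    check = make_fake_check(["pending", "running", "success"])
    print(wait_for_task(check, "task_xxx", sleep=lambda seconds: None))
```

Injecting the check and sleep callables keeps the loop testable without network access; in real use, check would wrap whatever status query your program already has.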

5. Usage query

Usage query returns the remaining quota and dashboard URL for the current API key. It can also quickly verify whether the API key is valid.
# Table output is the default and works well for humans
somark usage

# JSON is better for scripts
somark usage --format json

# text is lighter for logs
somark usage --format text
Entry point: somark usage.

Command parameters
| Capability | Flag / field | Description |
| --- | --- | --- |
| Output format | --format | CLI display format. Default: table. Supports text, json, and table. |

Output fields
| Capability | Flag / field | Description |
| --- | --- | --- |
| Remaining paid pages | Output field | Remaining pages from all unexpired paid plans. |
| Remaining free pages today | Output field | Today’s free quota. |
| Remaining free pages this month | Output field | This month’s free quota. |
| Dashboard URL | Output field | SoMark console link. |
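
For scripts, the JSON output can be read from Python with the standard library. The sketch below is hedged on several points: fetch_usage is a hypothetical helper, it assumes the command exits with a non-zero code on failure and prints only JSON to stdout, and it does not assume any field names, because the JSON keys are not listed here.

```python
import json
import shutil
import subprocess
from typing import Any, Optional


def fetch_usage(command: str = "somark") -> Optional[Any]:
    """Return the parsed output of `somark usage --format json`, or None.

    None means the command is not on PATH, exited with an error,
    or printed something that is not valid JSON.
    """
    if shutil.which(command) is None:
        return None
    completed = subprocess.run(
        # --no-warnings is the global option for quiet output.
        [command, "--no-warnings", "usage", "--format", "json"],
        capture_output=True,
        text=True,
        check=False,
    )
    if completed.returncode != 0:
        return None
    try:
        return json.loads(completed.stdout)
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    print(fetch_usage())
```

Because a usage query also verifies the API key, a None result is a reasonable signal to stop a batch job before it starts.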

6. SoMarkDown service

The SoMarkDown service starts a local preview server and opens a .md or .smd file in the browser. It does not require an API key and does not consume quota. The underlying rendering capability comes from SoMarkAI/SoMarkDown; go there when you need syntax and renderer details.
# Preview a SoMarkDown file and open the browser automatically
somark preview ./document.smd

# Set the local service host and port
somark preview ./document.md --host 127.0.0.1 --port 7878

# Start only the service, without opening the browser
somark preview --no-open
The JavaScript SDK also exports SoMarkDown. It does not start a local HTTP service; instead, it renders Markdown / SoMarkDown strings directly to HTML. Use it in Node services, custom frontends, or small tools when you need the rendered result.
JavaScript
import { SoMarkDown } from 'somark-js'

const renderer = new SoMarkDown()
const html = renderer.render('Inline: $e^{i\\pi} + 1 = 0$')

console.log(html)
Entry point: somark preview [file].
| Capability | Flag / field | Description |
| --- | --- | --- |
| Preview file | [file] | Supports .md and .smd. |
| Bind address | --host | Default: 127.0.0.1. |
| Port | --port | Default: 7878. |
| Auto-open browser | --no-open | Disable automatic browser opening. |

Output fields
| Capability | Flag / field | Description |
| --- | --- | --- |
| Service URL | Output field | Local service URL. |
| File URL | Output field | Preview URL with the file parameter. |

Stop the service by pressing Ctrl+C in the terminal.

7. PDF processing

PDF processing currently provides local PDF-to-image conversion. Use it for previews, page layout debugging, or handing PDF pages to another visual processing flow. The Python-side local PDF capability is based on SoMarkAI/SoPDF.
# Convert every page of a PDF to images in the current directory
somark pdf toimg ./document.pdf

# Set output directory and resolution
somark pdf toimg ./document.pdf --out ./pages --dpi 200

# Customize image filenames
somark pdf toimg ./document.pdf --format "{name}-{n}.png"
Entry point: somark pdf toimg <file>.
| Capability | Flag / field | Description |
| --- | --- | --- |
| PDF file | <file> | PDF to convert. |
| Output directory | --out | Default: ./. |
| Filename template | --format | Default: {name}.page-{n}.png. |
| Resolution | --dpi | Default: 150. |

Output fields
| Capability | Flag / field | Description |
| --- | --- | --- |
| Result files | Output field | Image paths after successful conversion. |
| Error message | Output field | Returned when local dependencies are missing or conversion fails. |
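
To see what a filename template expands to, the placeholders can be filled in with Python's str.format. This is a sketch under assumptions, not the converter's code: page_filenames is a hypothetical helper, {name} is assumed to be the PDF filename without its extension, and {n} is assumed to be a page number starting at 1.

```python
from pathlib import Path
from typing import List

DEFAULT_TEMPLATE = "{name}.page-{n}.png"  # default template listed above


def page_filenames(
    pdf_path: str,
    page_count: int,
    template: str = DEFAULT_TEMPLATE,
    out_dir: str = ".",
) -> List[str]:
    """Return the image paths a template would produce for each page."""
    name = Path(pdf_path).stem  # assumption: {name} is the file stem
    return [
        str(Path(out_dir) / template.format(name=name, n=n))
        for n in range(1, page_count + 1)  # assumption: pages count from 1
    ]


if __name__ == "__main__":
    for path in page_filenames("./document.pdf", 3, "{name}-{n}.png", "pages"):
        print(path)
```

Previewing the names this way helps confirm that a custom template will not make two pages overwrite each other, which would happen if the template omitted {n}.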

8. Doctor auto-fix

Doctor checks installation status, network access, API key configuration, and local preview assets. It is a command-line health check. It will not fix everything, but it catches many basic failures.
CLI
# Run basic diagnostics: config, network, and preview assets
somark doctor

# Try to repair SoMarkDown preview assets
somark doctor fix

# Only check network connectivity
somark doctor ping
Doctor is a CLI maintenance command; the SDK does not provide a corresponding resource. Programs usually do not need it. Use it when commands fail to run, preview does not open, or network status is uncertain.

For lower-level endpoints, fields, and response structures, see the API Reference. It is generated from the same interface specification, so it contains less hand-written text and is more reliable.