Use the CLI when you want to process files immediately. Use the SDK when you want to embed parsing in your own program. Both paths serve the same goal. If you want the positioning summary first, return to the SoMark CLI & SDK overview.
SoMark has Python and JavaScript implementations. Install the package for the language you use. Both packages provide the SDK and the somark CLI command. As a baseline, Python requires 3.10+ and Node.js requires 18+.
    # Python: install SDK and CLI
    pip install somark

    # JavaScript: install SDK and CLI
    npm install somark-js
If the somark command cannot be found, the package is usually installed correctly but its command directory is not on your PATH. Python users can run python -m somark.cli.main --help; Node users can run npx somark-js --help. Once that works, add the command directory to your PATH so that somark resolves directly.
If you install both the Python and JS versions, the CLI uses whichever one was installed first. The later installation skips CLI setup when it sees that the command already exists. To see which implementation is active, run somark --help; the footer shows [PY] or [JS].
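The two checks above (is the command on PATH, and which implementation answers) can be scripted. This is a minimal sketch using only the Python standard library. It assumes the help footer contains the literal marker [PY] or [JS] as described; the function names are illustrative and not part of SoMark.

```python
import shutil
import subprocess
from typing import Optional


def find_cli(command: str = "somark") -> Optional[str]:
    """Return the full path of the command if it is on PATH, else None."""
    return shutil.which(command)


def detect_implementation(help_text: str) -> Optional[str]:
    """Map the footer marker in `somark --help` output to an implementation name."""
    # Assumption: the footer contains the literal text "[PY]" or "[JS]".
    if "[PY]" in help_text:
        return "python"
    if "[JS]" in help_text:
        return "javascript"
    return None


def active_implementation(command: str = "somark") -> Optional[str]:
    """Run `<command> --help` and report which implementation answered."""
    path = find_cli(command)
    if path is None:
        # Not on PATH; try `python -m somark.cli.main --help` by hand instead.
        return None
    result = subprocess.run([path, "--help"], capture_output=True, text=True, check=False)
    return detect_implementation(result.stdout + result.stderr)
```

A return value of None from active_implementation() means either that the command is not on PATH or that no marker was found in the help text.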
Remote parsing and usage queries require an API key. Local PDF processing and SoMarkDown preview do not. You can save configuration through the CLI, or pass parameters directly when initializing the SDK.
If your project already uses .env, put SOMARK_API_KEY=sk-your-api-key there and load it with
python-dotenv, dotenv, or your own startup script. SoMark reads the environment variable itself;
it does not decide how your .env file is loaded. Do not commit .env to Git.
    # Interactive login writes the API key to local config
    somark login

    # You can also set, read, and inspect config directly
    somark config set api_key sk-your-api-key
    somark config get api_key
    somark config list

    # For temporary use, environment variables are lighter
    export SOMARK_API_KEY=sk-your-api-key
CLI
Entry points: somark login, somark config ..., or configuration through command flags and environment variables.
Command flags / environment variables

| Capability | Flag / field | Description |
| --- | --- | --- |
| API key | --api-key, SOMARK_API_KEY | Used for remote parsing and usage queries. |
| Base URL | --base-url, SOMARK_BASE_URL | Default: https://somark.tech/api/v1. |
| Timeout | --timeout, SOMARK_TIMEOUT | Measured in seconds. |
| Retry count | SOMARK_MAX_RETRIES | Default: 2. |
| Parse concurrency | SOMARK_PARSE_MAX_CONCURRENCY | Maximum concurrency for batch parsing. The default must remain 1. |
| Hide warnings | --no-warnings | Hide SoMark warning output; does not change command results, exit codes, or the warnings field on SDK response objects. |

SOMARK_PARSE_MAX_CONCURRENCY must stay at 1 by default. The official default parsing concurrency for all users is also
1; only users who have been explicitly approved for higher concurrency should set it to 2 or above. When the CLI detects
2 or above, it reports this through the warning channel. This is a local warning, not an API warning.
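To make the defaults and the concurrency warning concrete, here is a hedged sketch of how a script of your own might read the same environment variables. It is not the SDK's implementation: the Settings class and load_settings function are hypothetical, the timeout default is left unset because this guide does not state one, and UserWarning stands in for whatever warning type the real tools use.

```python
import os
import warnings
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://somark.tech/api/v1"


@dataclass
class Settings:
    api_key: Optional[str]
    base_url: str
    timeout: Optional[float]  # seconds; the default is not stated in this guide
    max_retries: int
    parse_max_concurrency: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read SoMark-related environment variables, applying the documented defaults."""
    timeout = env.get("SOMARK_TIMEOUT")
    settings = Settings(
        api_key=env.get("SOMARK_API_KEY"),
        base_url=env.get("SOMARK_BASE_URL", DEFAULT_BASE_URL),
        timeout=float(timeout) if timeout else None,
        max_retries=int(env.get("SOMARK_MAX_RETRIES", "2")),
        parse_max_concurrency=int(env.get("SOMARK_PARSE_MAX_CONCURRENCY", "1")),
    )
    if settings.parse_max_concurrency >= 2:
        # Local warning (not an API warning): the official default quota is 1.
        warnings.warn(
            "SOMARK_PARSE_MAX_CONCURRENCY is above the default quota of 1",
            UserWarning,
            stacklevel=2,
        )
    return settings
```

Passing a plain dict instead of os.environ makes the function easy to exercise in tests without touching the real environment.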
SoMark warnings fall into two categories: API warnings and local warnings. API warnings come from the top-level response field warnings: List[str], at the same level as code and message. Local warnings come from SDK or CLI runtime checks, such as a batch parsing concurrency setting above the default quota.
The CLI shows SoMark warnings by default. Use the global --no-warnings option when you need quiet output:
CLI
somark --no-warnings parse ./document.pdf
--no-warnings only hides warning display. It does not change command results, exit codes, or remove the warnings field from SDK response objects.
The SDK shows warnings through the native warning channel of each language by default, and also keeps warning strings in the response object’s warnings field. An empty array means there are no warnings.
    import warnings
    from somark import SoMark, SoMarkWarning

    # If you do not want Python programs to display SoMark warnings, filter them with warnings
    warnings.filterwarnings("ignore", category=SoMarkWarning)

    client = SoMark(api_key="sk-your-api-key")
    response = client.parser.parse(file="./document.pdf")
    print(response.warnings)  # list[str]
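Instead of silencing warnings, a program may prefer to collect them, for example to forward them to a logger. The sketch below shows that pattern with the standard warnings module. It uses a stand-in class, DemoSoMarkWarning, and a fake parse function so that it runs without the SDK; in real code you would presumably use SoMarkWarning from the example above in its place.

```python
import warnings


class DemoSoMarkWarning(UserWarning):
    """Stand-in for somark.SoMarkWarning so this sketch runs without the SDK."""


def fake_parse() -> list[str]:
    """Hypothetical stand-in for a parse call that emits one warning."""
    message = "parse concurrency is above the default quota"
    warnings.warn(message, DemoSoMarkWarning, stacklevel=2)
    return [message]  # mirrors the idea of response.warnings


def call_and_collect(func):
    """Run func and return (result, messages of DemoSoMarkWarning emitted meanwhile)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every occurrence of this category, even repeated ones.
        warnings.simplefilter("always", DemoSoMarkWarning)
        result = func()
    messages = [
        str(item.message)
        for item in caught
        if issubclass(item.category, DemoSoMarkWarning)
    ]
    return result, messages
```

Because catch_warnings restores the previous filters on exit, collecting warnings this way does not change how the rest of the program displays them.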
Parsing is the main task for SoMark SDK + CLI. You give SoMark a file, and it returns Markdown, JSON, or a ZIP download URL. The default output format is md, which corresponds to the API’s markdown output.
There are two usage patterns: sync parsing and async parsing.
Sync parsing is for “I need the result now” scenarios. The CLI or SDK sends the file to /parse/sync and waits for the server to return the result. It takes little code, has low mental overhead, and works well for single files, scripts, debugging, and small to medium documents. The tradeoff is direct: the larger the file, the longer you wait.
Async parsing is for large files, batch processing, and background tasks. You send the file to /parse/async, immediately get a task_id, then query /parse/async_check with that task_id for progress. The CLI --wait option and SDK task.wait() helper are polling wrappers. This is a better fit for queues, scheduled jobs, and server-side flows.
Sync parsing flow: One request returns the result directly. Best for scripts, debugging, and small to medium files.
Async parsing flow: Submit the task first, then query status by task_id. Best for large files, batch processing, and background queues.
Sync parsing is the easiest way to prove the flow works: provide one file, wait for the result, and save it locally. Start with md, check the content quality, then add json, zip, or page feature options as needed.
    # One format without --out: writes to stdout, useful for pipes and scripts
    somark parse ./document.pdf --formats md

    # Quiet mode: hide SoMark warning output
    somark --no-warnings parse ./document.pdf --formats md

    # One format to a specific file; the parent directory must already exist
    somark parse ./document.pdf --formats md --out ./document.md

    # Multiple formats must go to an existing directory; creates parsed/document.md and parsed/document.json
    mkdir -p ./parsed
    somark parse ./document.pdf --formats md,json --out ./parsed/

    # Separate multiple files with spaces; the batch output directory must also exist first
    mkdir -p ./parsed
    somark parse ./a.pdf ./b.pdf --out ./parsed/

    # You can also read from a list file, one file path per line
    mkdir -p ./parsed
    somark parse --file-list ./files.txt --out ./parsed/

    # Output Markdown with title-level recognition and HTML tables enabled
    somark parse ./paper.pdf --formats md --title-levels --table-fmt html
CLI
Entry point: somark parse [files...].
| Capability | Flag / field | Description |
| --- | --- | --- |
| File | [files...] | One or more file paths to parse. Separate multiple files with spaces. |
| File list | --file-list | Read a text file where each non-empty line is a file path; relative paths resolve from the current working directory. |
| Output formats | --formats | Supports md/markdown, json, and zip. |
| Output target | --out | Output file path, or an existing directory. The CLI does not create directories automatically. |
| Image format | --image-fmt | url, base64, or none. |
| Formula format | --formula-fmt | latex, mathml, or ascii. |
| Table format | --table-fmt | markdown, html, or image. |
| Chemical structure format | --cs-fmt | Currently supports image. |
| Title levels | --title-levels | Recognize heading levels. |
| Text across pages | --cross-page-text | Merge text across pages. |
| Tables across pages | --cross-page-table | Merge tables across pages. |
| Inline images | --no-inline-image | Disable inline images. |
| Images in tables | --no-table-image | Disable images in tables. |
| Image understanding | --no-image-understanding | Disable image understanding. |
| Header and footer | --keep-header-footer | Keep headers and footers. |

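The --file-list rule in the table above (one path per non-empty line, relative paths resolved from the current working directory) is easy to mirror when you generate or validate list files yourself. This is a minimal sketch, assuming UTF-8 list files and that surrounding whitespace on a line is not significant; the function names are illustrative and not part of SoMark.

```python
from pathlib import Path


def parse_file_list(text: str, cwd: Path) -> list[Path]:
    """Return one path per non-empty line, resolving relative entries against cwd."""
    paths = []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # blank lines are skipped
        # Joining with cwd leaves absolute entries unchanged.
        paths.append(cwd / entry)
    return paths


def read_file_list(list_path: str) -> list[Path]:
    """Read a list file and resolve its entries from the current working directory."""
    text = Path(list_path).read_text(encoding="utf-8")
    return parse_file_list(text, Path.cwd())
```

Note that entries resolve against the working directory of the process, not against the directory that contains the list file, so the same files.txt can point at different files depending on where the command runs.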
Return handling
--out only means the output target. It does not mean “create the directory for me” or “treat this filename as a template”. This makes script behavior easier to predict.

| Scenario | Behavior |
| --- | --- |
| Single file, single format, no --out | Writes to stdout. |
| Multiple formats, no --out | Errors; multiple formats require --out pointing to an existing directory. |
| --out is an existing directory | Generates .md, .json, or .zip by input file stem. |
| --out is a file path | Allowed only for a single format; parent directory must already exist. |
| --out is a nonexistent directory | Errors; create the directory yourself first. |
| ZIP output | The response contains a download URL; saving as .zip downloads the content from that URL and writes it to file. |

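The return-handling rules above can be read as a small decision procedure. The sketch below models them for a single input file; it is an illustration of the table, not the CLI's actual code. The function name, the use of "-" for stdout, and the trailing-slash check for "this was meant to be a directory" are assumptions. It does not model the ZIP download step.

```python
from pathlib import Path
from typing import Optional

EXTENSIONS = {"md": ".md", "markdown": ".md", "json": ".json", "zip": ".zip"}


def resolve_outputs(input_file: str, formats: list[str], out: Optional[str]) -> list[str]:
    """Return target paths ("-" means stdout) or raise ValueError, following the table."""
    if out is None:
        if len(formats) == 1:
            return ["-"]  # single file, single format: stdout
        raise ValueError("multiple formats require --out pointing to an existing directory")

    target = Path(out)
    if target.is_dir():
        # Existing directory: one file per format, named after the input file stem.
        stem = Path(input_file).stem
        return [str(target / (stem + EXTENSIONS[fmt])) for fmt in formats]

    # Not an existing directory, so it can only be a file path.
    if len(formats) != 1:
        raise ValueError("a file path is allowed only for a single format")
    if out.endswith(("/", "\\")) or not target.parent.is_dir():
        raise ValueError("directory does not exist; create it first")
    return [str(target)]
```

For example, resolve_outputs("document.pdf", ["md", "json"], ".") yields one .md and one .json path in the current directory, while the same call with out=None raises an error.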
Multi-file execution

| Capability | Flag / field | Description |
| --- | --- | --- |
| Concurrency limit | Environment config | Default concurrency is 1; see section 2, “Authentication and configuration”. |
| Concurrency warning | Warning channel | When concurrency is set to 2 or above, a local warning reminds you that the official default quota is 1. |
| Progress and results | CLI output | Shows total file count, current progress, each file’s existence, status, duration, and output location. |
| Missing file | missing status | The CLI does not send missing files, continues with other files, and exits with code 1. |

Python
Entry point: client.parser.parse(...).

| Capability | Flag / field | Description |
| --- | --- | --- |
| File | file | File path to parse; pass list[str] for multiple files. |
| File list | file_list | When True, file is read as a list file; relative paths resolve from the current working directory. |
| Output formats | formats | Supports md/markdown, json, and zip. |
| Element formats | element_formats | Pass ElementFormats(...) or a dict with the same structure. |
| Extraction config | extract_config | Pass ExtractConfig(...) or a dict with the same structure. |

Element formats: ElementFormats(...)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Image format | ElementFormats.image | url, base64, or none. |
| Formula format | ElementFormats.formula | latex, mathml, or ascii. |
| Table format | ElementFormats.table | markdown, html, or image. |
| Chemical structure format | ElementFormats.cs | Currently supports image. |

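Since element_formats also accepts a dict with the same structure, it can help to check such a dict against the values in the table before sending a request. The following is a sketch of a client-side check only; whether the SDK performs its own validation is not stated here, and validate_element_formats is a hypothetical helper, not an SDK function.

```python
ALLOWED_ELEMENT_FORMATS = {
    "image": {"url", "base64", "none"},
    "formula": {"latex", "mathml", "ascii"},
    "table": {"markdown", "html", "image"},
    "cs": {"image"},
}


def validate_element_formats(config: dict[str, str]) -> dict[str, str]:
    """Return config unchanged if every key and value appears in the table above."""
    for key, value in config.items():
        allowed = ALLOWED_ELEMENT_FORMATS.get(key)
        if allowed is None:
            raise ValueError(f"unknown element format key: {key!r}")
        if value not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {value!r}")
    return config


# A dict with the same structure as ElementFormats(...).
element_formats = validate_element_formats(
    {"image": "url", "formula": "latex", "table": "html"}
)
```

Keeping the allowed values in one mapping means a typo such as "table": "csv" fails locally with a readable message instead of surfacing later as a request error.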
Extraction config: ExtractConfig(...)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Title levels | ExtractConfig.enable_title_level_recognition | Recognize heading levels. |
| Text across pages | ExtractConfig.enable_text_cross_page | Merge text across pages. |
| Tables across pages | ExtractConfig.enable_table_cross_page | Merge tables across pages. |
| Inline images | ExtractConfig.enable_inline_image | Configured as a boolean in the SDK. |
| Images in tables | ExtractConfig.enable_table_image | Configured as a boolean in the SDK. |
| Image understanding | ExtractConfig.enable_image_understanding | Configured as a boolean in the SDK. |
| Header and footer | ExtractConfig.keep_header_footer | Keep headers and footers. |

Save result: response.save(path, format)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Save path | path | Output file path, or an existing directory. |
| Save format | format | Optional; supports one format or multiple formats. |

path can be a file path or an existing directory. Multiple formats can only be saved to an existing directory. The SDK does not create directories automatically. Saving zip downloads the content from response.zip_url; without saving, read response.md, response.json_output, and response.zip_url directly.
Response fields: ParseResponse

| Capability | Flag / field | Description |
| --- | --- | --- |
| Warnings | response.warnings | API warning and local warning string list; an empty array means there are no warnings. |

Multi-file return value

| Capability | Flag / field | Description |
| --- | --- | --- |
| Return type | list[ParseResponse] | Returned only for multiple files or file_list=True. |
| Return order | Input order | The returned list stays in input order even if concurrent jobs finish in a different order. |
| Max concurrency | Environment config | Default concurrency is 1; see section 2, “Authentication and configuration”. |
| Missing file | InvalidParamError | The SDK checks all files first; if any file is missing, it sends no requests. |

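Two behaviors in the table above are worth seeing in code: checking every file before sending anything, and returning results in input order regardless of completion order. The sketch below reproduces both with the standard library. It is not the SDK's implementation; run_batch and its parameters are hypothetical, and FileNotFoundError stands in for the SDK's InvalidParamError.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def run_batch(
    files: Sequence[str],
    parse_one: Callable[[str], T],
    max_concurrency: int = 1,
    exists: Callable[[str], bool] = os.path.isfile,
) -> list[T]:
    """Check every file first, then parse concurrently and return results in input order."""
    missing = [path for path in files if not exists(path)]
    if missing:
        # Mirrors the documented behavior: nothing is sent if any file is missing.
        # FileNotFoundError is a stand-in for the SDK's InvalidParamError.
        raise FileNotFoundError(f"missing files: {missing}")
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # Executor.map yields results in the order of the inputs,
        # even when later jobs finish before earlier ones.
        return list(pool.map(parse_one, files))
```

The default max_concurrency of 1 matches the documented default quota. The exists parameter is injectable only so that the check can be exercised without real files.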
JavaScript
Entry point: client.parser.parse(...).

| Capability | Flag / field | Description |
| --- | --- | --- |
| File | file | File path to parse; pass string[] for multiple files. |
| File list | fileList | When true, file is read as a list file; relative paths resolve from the current working directory. |
| Output formats | formats | Supports md/markdown, json, and zip. |
| Element formats | elementFormats | Pass an element formats object. |
| Extraction config | extractConfig | Pass an extraction config object. |

Element formats: elementFormats

| Capability | Flag / field | Description |
| --- | --- | --- |
| Image format | elementFormats.image | url, base64, or none. |
| Formula format | elementFormats.formula | latex, mathml, or ascii. |
| Table format | elementFormats.table | markdown, html, or image. |
| Chemical structure format | elementFormats.cs | Currently supports image. |

Extraction config: extractConfig

| Capability | Flag / field | Description |
| --- | --- | --- |
| Title levels | extractConfig.enableTitleLevelRecognition | Recognize heading levels. |
| Text across pages | extractConfig.enableTextCrossPage | Merge text across pages. |
| Tables across pages | extractConfig.enableTableCrossPage | Merge tables across pages. |
| Inline images | extractConfig.enableInlineImage | Configured as a boolean in the SDK. |
| Images in tables | extractConfig.enableTableImage | Configured as a boolean in the SDK. |
| Image understanding | extractConfig.enableImageUnderstanding | Configured as a boolean in the SDK. |
| Header and footer | extractConfig.keepHeaderFooter | Keep headers and footers. |

Save result: response.save(path, format)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Save path | path | Output file path, or an existing directory. |
| Save format | format | Optional; supports one format or multiple formats. |

path can be a file path or an existing directory. Multiple formats can only be saved to an existing directory. The SDK does not create directories automatically. Saving zip downloads the content from response.zipUrl; without saving, read response.md, response.jsonOutput, and response.zipUrl directly.
Response fields: ParseResponse

| Capability | Flag / field | Description |
| --- | --- | --- |
| Warnings | response.warnings | API warning and local warning string array; an empty array means there are no warnings. |

Multi-file return value

| Capability | Flag / field | Description |
| --- | --- | --- |
| Return type | Promise<ParseResponse[]> | Returned only for multiple files or fileList: true. |
| Return order | Input order | The returned array stays in input order even if concurrent jobs finish in a different order. |
| Max concurrency | Environment config | Default concurrency is 1; see section 2, “Authentication and configuration”. |
| Missing file | InvalidParamError | The SDK checks all files first; if any file is missing, it sends no requests. |

Async parsing is best for large files and batch processing. Submit first, get a task_id, then poll until success or failure. A 3 to 5 second polling interval is recommended; avoid polling too frequently, because the server needs time to work.
    # 1. Submit a task and only get task_id; do not wait for the result
    somark parse ./large.pdf --async --formats md,json

    # Submit multiple files at once; each file gets its own task_id
    somark parse ./a.pdf ./b.pdf --async --formats md

    # You can also read from a list file
    somark parse --file-list ./files.txt --async --formats md

    # 2. Wait for an existing task to finish and save the result locally
    somark parse --task-id task_xxx --wait --out ./large.md

    # 3. Query status once without blocking
    somark parse --task-id task_xxx
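The CLI --wait option and the SDK task.wait() helper are described as polling wrappers. To show what such a wrapper does, here is a generic sketch that polls an injected status function until it reports a terminal state or a timeout expires. It is not the SDK's code: wait_for_task is a hypothetical name, the status strings "success" and "failed" are assumptions, and the defaults of 3 and 120 seconds are borrowed from the documented task.wait() defaults.

```python
import time
from typing import Callable


def wait_for_task(
    check_status: Callable[[], str],
    poll_interval: float = 3.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll check_status until it reports a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = check_status()
        if status in ("success", "failed"):  # assumed terminal status names
            return status
        if time.monotonic() + poll_interval > deadline:
            # Another sleep would overshoot the deadline, so give up now.
            raise TimeoutError(f"task still {status!r} after {timeout} seconds")
        sleep(poll_interval)
```

In a real program check_status would wrap one status query for a given task_id. Injecting sleep keeps the loop testable without actually waiting.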
CLI
Entry points: somark parse [files...] --async to submit tasks; somark parse --task-id task_xxx to query a task.
Submit task: somark parse [files...] --async

| Capability | Flag / field | Description |
| --- | --- | --- |
| File | [files...] | Required when submitting a task; separate multiple files with spaces. |
| File list | --file-list | Read a text file where each non-empty line is a file path; relative paths resolve from the current working directory. |
| Output formats | --formats | Same as sync parsing. |

When submitting multiple files asynchronously, each file receives its own task_id. The CLI shows each file’s existence, submission status, duration, and task ID.
Query / wait task: somark parse --task-id task_xxx

| Capability | Flag / field | Description |
| --- | --- | --- |
| Task ID | --task-id | Query an existing task. |
| Wait switch | --wait | Block until completion. |
| Output target | --out | Save the result after completion; rules are the same as sync parsing. |

Python
Entry point: submit tasks with client.parser.create(...).
Submit task: client.parser.create(...)

| Capability | Flag / field | Description |
| --- | --- | --- |
| File | file | Required when submitting a task; pass list[str] for multiple files. |
| File list | file_list | When True, file is read as a list file; relative paths resolve from the current working directory. |
| Output formats | formats | Same as sync parsing. |

Multi-file submission returns list[Task] in input order.
Task object fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Task ID | task.id | Provided by the task object returned after submission. |
| Warnings | task.warnings | Warning string list received during task submission or status query; an empty array means there are no warnings. |

Wait for result: task.wait(...)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Poll interval | poll_interval | Measured in seconds. Default: 3. |
| Timeout | timeout | Measured in seconds. Default: 120. |

Query status: task.refresh()

| Capability | Flag / field | Description |
| --- | --- | --- |
| Parameters | None | Updates task status, page count, filename, and related information. |

Save result: result.save(path, format)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Save path | path | Output file path, or an existing directory. |
| Save format | format | Optional; supports one format or multiple formats. |

JavaScript
Entry point: submit tasks with client.parser.create(...).
Submit task: client.parser.create(...)

| Capability | Flag / field | Description |
| --- | --- | --- |
| File | file | Required when submitting a task; pass string[] for multiple files. |
| File list | fileList | When true, file is read as a list file; relative paths resolve from the current working directory. |
| Output formats | formats | Same as sync parsing. |

Multi-file submission returns Task[] in input order.
Task object fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Task ID | task.id | Provided by the task object returned after submission. |
| Warnings | task.warnings | Warning string array received during task submission or status query; an empty array means there are no warnings. |

Wait for result: task.wait(...)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Poll interval | pollInterval | Measured in seconds. Default: 3. |
| Timeout | timeout | Measured in seconds. Default: 120. |

Query status: task.refresh()

| Capability | Flag / field | Description |
| --- | --- | --- |
| Parameters | None | Updates task status, page count, filename, and related information. |

Save result: result.save(path, format)

| Capability | Flag / field | Description |
| --- | --- | --- |
| Save path | path | Output file path, or an existing directory. |
| Save format | format | Optional; supports one format or multiple formats. |

Usage query returns the remaining quota and dashboard URL for the current API key. It can also quickly verify whether the API key is valid.
    # Table output is the default and works well for humans
    somark usage

    # JSON is better for scripts
    somark usage --format json

    # text is lighter for logs
    somark usage --format text
CLI
Entry point: somark usage.
Command parameters

| Capability | Flag / field | Description |
| --- | --- | --- |
| Output format | --format | CLI display format. Default: table. Supports text, json, and table. |

Output fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Remaining paid pages | Output field | Remaining pages from all unexpired paid plans. |
| Remaining free pages today | Output field | Today’s free quota. |
| Remaining free pages this month | Output field | This month’s free quota. |
| Dashboard URL | Output field | SoMark console link. |

Python
Entry point: client.usage.get(). This method does not need additional parameters.
Return fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Remaining paid pages | remaining_paid_pages | Remaining pages from all unexpired paid plans. |
| Remaining free pages today | remaining_free_pages_today | Today’s free quota. |
| Remaining free pages this month | remaining_free_pages_this_month | This month’s free quota. |
| Dashboard URL | dashboard_url | SoMark console link. |
| Warnings | usage.warnings | API warning string list; an empty array means there are no warnings. |

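A common use of the usage query is a pre-flight check before a batch run. The sketch below works on a plain mapping that uses the Python field names from the table above; how you turn the SDK's return object into such a mapping is not covered here and is an assumption. The helper names are hypothetical, and the sample numbers and URL in the usage note are made up for illustration.

```python
from typing import Any, Mapping

COUNTERS = (
    "remaining_paid_pages",
    "remaining_free_pages_today",
    "remaining_free_pages_this_month",
)


def is_exhausted(usage: Mapping[str, Any]) -> bool:
    """True when every remaining-page counter is zero."""
    return all(int(usage[name]) == 0 for name in COUNTERS)


def summarize(usage: Mapping[str, Any]) -> str:
    """Build a one-line summary suitable for logs."""
    parts = [f"{name}={usage[name]}" for name in COUNTERS]
    return " ".join(parts) + f" dashboard={usage['dashboard_url']}"
```

For instance, with a mapping holding 0 paid pages, 5 free pages today, 40 free pages this month, and a dashboard URL, is_exhausted returns False and summarize produces a single key=value line that can go straight into a log.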
JavaScript
Entry point: client.usage.get(). This method does not need additional parameters.
Return fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Remaining paid pages | remainingPaidPages | Remaining pages from all unexpired paid plans. |
| Remaining free pages today | remainingFreePagesToday | Today’s free quota. |
| Remaining free pages this month | remainingFreePagesThisMonth | This month’s free quota. |
| Dashboard URL | dashboardUrl | SoMark console link. |
| Warnings | usage.warnings | API warning string array; an empty array means there are no warnings. |

The SoMarkDown service starts a local preview server and opens a .md or .smd file in the browser. It does not require an API key and does not consume quota. The underlying rendering capability comes from SoMarkAI/SoMarkDown; go there when you need syntax and renderer details.
    # Preview a SoMarkDown file and open the browser automatically
    somark preview ./document.smd

    # Set the local service host and port
    somark preview ./document.md --host 127.0.0.1 --port 7878

    # Start only the service, without opening the browser
    somark preview --no-open
The JavaScript SDK also exports SoMarkDown. It does not start a local HTTP service; instead, it renders
Markdown / SoMarkDown strings directly to HTML. Use it in Node services, custom frontends, or small tools
when you need the rendered result.
JavaScript
    import { SoMarkDown } from 'somark-js'

    const renderer = new SoMarkDown()
    const html = renderer.render('Inline: $e^{i\\pi} + 1 = 0$')
    console.log(html)
CLI
Entry point: somark preview [file].

| Capability | Flag / field | Description |
| --- | --- | --- |
| Preview file | [file] | Supports .md and .smd. |
| Bind address | --host | Default: 127.0.0.1. |
| Port | --port | Default: 7878. |
| Auto-open browser | --no-open | Disable automatic browser opening. |

Output fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Service URL | Output field | Local service URL. |
| File URL | Output field | Preview URL with the file parameter. |

Stop the service by pressing Ctrl+C in the terminal.
Python
Entry point: SoMarkDownPreview().start(...).

| Capability | Flag / field | Description |
| --- | --- | --- |
| Preview file | file | Supports .md and .smd. |
| Bind address | host | Default: 127.0.0.1. |
| Port | port | Default: 7878. |
| Auto-open browser | open_browser | Configured as a boolean in the SDK. |

Returned object fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Service URL | server.url | Local service URL. |
| File URL | server.file_url | Preview URL with the file parameter. |

Stop service: server.stop()

| Capability | Flag / field | Description |
| --- | --- | --- |
| Parameters | None | Stops the local preview service. |

JavaScript
Entry point: new SoMarkDownPreview().start(...).

| Capability | Flag / field | Description |
| --- | --- | --- |
| Preview file | file | Supports .md and .smd. |
| Bind address | host | Default: 127.0.0.1. |
| Port | port | Default: 7878. |
| Auto-open browser | openBrowser | Configured as a boolean in the SDK. |

Returned object fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Service URL | server.url | Local service URL. |
| File URL | server.fileUrl | Preview URL with the file parameter. |

Stop service: server.stop()

| Capability | Flag / field | Description |
| --- | --- | --- |
| Parameters | None | Stops the local preview service. |

String rendering: new SoMarkDown().render(markdown)
PDF processing currently provides local PDF-to-image conversion. Use it for previews, page layout debugging, or handing PDF pages to another visual processing flow. The Python-side local PDF capability is based on SoMarkAI/SoPDF.
    # Convert every page of a PDF to images in the current directory
    somark pdf toimg ./document.pdf

    # Set output directory and resolution
    somark pdf toimg ./document.pdf --out ./pages --dpi 200

    # Customize image filenames
    somark pdf toimg ./document.pdf --format "{name}-{n}.png"
CLI
Entry point: somark pdf toimg <file>.

| Capability | Flag / field | Description |
| --- | --- | --- |
| PDF file | <file> | PDF to convert. |
| Output directory | --out | Default: ./. |
| Filename template | --format | Default: {name}.page-{n}.png. |
| Resolution | --dpi | Default: 150. |

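To predict which files a conversion will produce, it helps to see how a template such as {name}.page-{n}.png expands. The sketch below is an illustration only. It assumes that {name} is the PDF file stem and that {n} is a 1-based page number without zero padding; neither detail is confirmed by this guide, and page_filenames is a hypothetical helper.

```python
from pathlib import Path


def page_filenames(
    pdf_path: str,
    page_count: int,
    template: str = "{name}.page-{n}.png",
) -> list[str]:
    """Expand the filename template once per page."""
    # Assumption: {name} is the file stem, e.g. "document" for ./document.pdf.
    name = Path(pdf_path).stem
    # Assumption: {n} is a 1-based page number without zero padding.
    return [template.format(name=name, n=page) for page in range(1, page_count + 1)]
```

Under these assumptions a two-page ./document.pdf with the default template would map to document.page-1.png and document.page-2.png, which is worth verifying against real output before relying on it in a pipeline.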
Output fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Result files | Output field | Image paths after successful conversion. |
| Error message | Output field | Returned when local dependencies are missing or conversion fails. |

Python
Entry point: PDFProcessor().to_images(...).

| Capability | Flag / field | Description |
| --- | --- | --- |
| PDF file | file | PDF to convert. |
| Output directory | out | Default: ./. |
| Filename template | filename_format | Default: {name}.page-{n}.png. |
| Resolution | dpi | Default: 150. |

Return fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Result files | result.files | Image paths after successful conversion. |
| Error message | result.error | Returned when local dependencies are missing or conversion fails. |

JavaScript
Entry point: new PDFProcessor().toImages(...).

| Capability | Flag / field | Description |
| --- | --- | --- |
| PDF file | file | PDF to convert. |
| Output directory | out | Default: ./. |
| Filename template | filenameFormat | Default: {name}.page-{n}.png. |
| Resolution | dpi | Default: 150. |

Return fields

| Capability | Flag / field | Description |
| --- | --- | --- |
| Result files | result.files | Image paths after successful conversion. |
| Error message | result.error | Returned when local dependencies are missing or conversion fails. |

Doctor checks installation status, network access, API key configuration, and local preview assets. It is a command-line health check. It will not fix everything, but it catches many basic failures.
CLI
    # Run basic diagnostics: config, network, and preview assets
    somark doctor

    # Try to repair SoMarkDown preview assets
    somark doctor fix

    # Only check network connectivity
    somark doctor ping
Doctor is a CLI maintenance command; the SDK does not provide a corresponding resource. Programs usually do not need it. Use it when commands fail to run, preview does not open, or network status is uncertain.
For lower-level endpoints, fields, and response structures, see API Reference. It is generated from the same interface specification: less hand-written text, more reliability.