OpenAI APIs - Embedding
SGLang provides OpenAI-compatible APIs to enable a smooth transition from OpenAI services to self-hosted local models. A complete reference for the API is available in the OpenAI API Reference.
This tutorial covers the embedding APIs for embedding models. For a list of supported models, see the corresponding overview page.
Launch A Server
Launch the server in your terminal and wait for it to initialize. Remember to add the --is-embedding flag to the command.
[1]:
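from sglang.test.test_utils import is_in_ci

# In CI, use the patched launcher; otherwise use the one shipped with sglang.
if is_in_ci():
    from patch import launch_server_cmd
else:
    from sglang.utils import launch_server_cmd

from sglang.utils import wait_for_server, print_highlight, terminate_process

# Start the embedding server; the call returns the server process and its port.
embedding_process, port = launch_server_cmd(
    """
python3 -m sglang.launch_server --model-path Alibaba-NLP/gte-Qwen2-1.5B-instruct \
    --host 0.0.0.0 --is-embedding
"""
)

# Wait for the server to come up before sending requests.
wait_for_server(f"http://localhost:{port}")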
[2025-06-23 19:14:05] server_args=ServerArgs(model_path='Alibaba-NLP/gte-Qwen2-1.5B-instruct', tokenizer_path='Alibaba-NLP/gte-Qwen2-1.5B-instruct', tokenizer_mode='auto', skip_tokenizer_init=False, load_format='auto', trust_remote_code=False, dtype='auto', kv_cache_dtype='auto', quantization=None, quantization_param_path=None, context_length=None, device='cuda', served_model_name='Alibaba-NLP/gte-Qwen2-1.5B-instruct', chat_template=None, completion_template=None, is_embedding=True, enable_multimodal=None, revision=None, impl='auto', host='0.0.0.0', port=36268, mem_fraction_static=0.874, max_running_requests=200, max_total_tokens=20480, chunked_prefill_size=8192, max_prefill_tokens=16384, schedule_policy='fcfs', schedule_conservativeness=1.0, cpu_offload_gb=0, page_size=1, tp_size=1, pp_size=1, max_micro_batch_size=None, stream_interval=1, stream_output=False, random_seed=818751212, constrained_json_whitespace_pattern=None, watchdog_timeout=300, dist_timeout=None, download_dir=None, base_gpu_id=0, gpu_id_step=1, sleep_on_idle=False, log_level='info', log_level_http=None, log_requests=False, log_requests_level=0, show_time_cost=False, enable_metrics=False, bucket_time_to_first_token=None, bucket_e2e_request_latency=None, bucket_inter_token_latency=None, collect_tokens_histogram=False, decode_log_interval=40, enable_request_time_stats_logging=False, kv_events_config=None, api_key=None, file_storage_path='sglang_storage', enable_cache_report=False, reasoning_parser=None, tool_call_parser=None, dp_size=1, load_balance_method='round_robin', dist_init_addr=None, nnodes=1, node_rank=0, json_model_override_args='{}', preferred_sampling_params=None, lora_paths=None, max_loras_per_batch=8, lora_backend='triton', attention_backend=None, sampling_backend='flashinfer', grammar_backend='xgrammar', mm_attention_backend=None, speculative_algorithm=None, speculative_draft_model_path=None, speculative_num_steps=None, speculative_eagle_topk=None, speculative_num_draft_tokens=None, 
speculative_accept_threshold_single=1.0, speculative_accept_threshold_acc=1.0, speculative_token_map=None, ep_size=1, enable_ep_moe=False, enable_deepep_moe=False, enable_flashinfer_moe=False, deepep_mode='auto', ep_num_redundant_experts=0, ep_dispatch_algorithm='static', init_expert_location='trivial', enable_eplb=False, eplb_algorithm='auto', eplb_rebalance_num_iterations=1000, eplb_rebalance_layers_per_chunk=None, expert_distribution_recorder_mode=None, expert_distribution_recorder_buffer_size=1000, enable_expert_distribution_metrics=False, deepep_config=None, moe_dense_tp_size=None, enable_double_sparsity=False, ds_channel_config_path=None, ds_heavy_channel_num=32, ds_heavy_token_num=256, ds_heavy_channel_type='qk', ds_sparse_decode_threshold=4096, disable_radix_cache=False, cuda_graph_max_bs=None, cuda_graph_bs=None, disable_cuda_graph=True, disable_cuda_graph_padding=False, enable_profile_cuda_graph=False, enable_nccl_nvls=False, enable_tokenizer_batch_encode=False, disable_outlines_disk_cache=False, disable_custom_all_reduce=False, enable_mscclpp=False, disable_overlap_schedule=False, disable_overlap_cg_plan=False, enable_mixed_chunk=False, enable_dp_attention=False, enable_dp_lm_head=False, enable_two_batch_overlap=False, enable_torch_compile=False, torch_compile_max_bs=32, torchao_config='', enable_nan_detection=False, enable_p2p_check=False, triton_attention_reduce_in_fp32=False, triton_attention_num_kv_splits=8, num_continuous_decode_steps=1, delete_ckpt_after_loading=False, enable_memory_saver=False, allow_auto_truncate=False, enable_custom_logit_processor=False, enable_hierarchical_cache=False, hicache_ratio=2.0, hicache_size=0, hicache_write_policy='write_through_selective', flashinfer_mla_disable_ragged=False, disable_shared_experts_fusion=False, disable_chunked_prefix_cache=False, disable_fast_image_processor=False, enable_return_hidden_states=False, warmups=None, debug_tensor_dump_output_folder=None, debug_tensor_dump_input_file=None, 
debug_tensor_dump_inject=False, debug_tensor_dump_prefill_only=False, disaggregation_mode='null', disaggregation_transfer_backend='mooncake', disaggregation_bootstrap_port=8998, disaggregation_decode_tp=None, disaggregation_decode_dp=None, disaggregation_prefill_pp=1, disaggregation_ib_device=None, num_reserved_decode_tokens=512, pdlb_url=None, custom_weight_loader=[])
[2025-06-23 19:14:08] Downcasting torch.float32 to torch.float16.
[2025-06-23 19:14:16] Downcasting torch.float32 to torch.float16.
[2025-06-23 19:14:16] Overlap scheduler is disabled for embedding models.
[2025-06-23 19:14:16] Downcasting torch.float32 to torch.float16.
[2025-06-23 19:14:16] Attention backend not set. Use fa3 backend by default.
[2025-06-23 19:14:16] Init torch distributed begin.
[2025-06-23 19:14:17] Init torch distributed ends. mem usage=0.00 GB
[2025-06-23 19:14:17] Load weight begin. avail mem=60.49 GB
[2025-06-23 19:14:18] Using model weights format ['*.safetensors']
Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 50% Completed | 1/2 [00:01<00:01, 1.53s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.00s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.08s/it]
[2025-06-23 19:14:20] Load weight end. type=Qwen2ForCausalLM, dtype=torch.float16, avail mem=56.99 GB, mem usage=3.51 GB.
[2025-06-23 19:14:20] KV Cache is allocated. #tokens: 20480, K size: 0.27 GB, V size: 0.27 GB
[2025-06-23 19:14:20] Memory pool end. avail mem=56.16 GB
[2025-06-23 19:14:20] max_total_num_tokens=20480, chunked_prefill_size=8192, max_prefill_tokens=16384, max_running_requests=200, context_len=131072, available_gpu_mem=56.07 GB
[2025-06-23 19:14:21] INFO: Started server process [282468]
[2025-06-23 19:14:21] INFO: Waiting for application startup.
[2025-06-23 19:14:21] INFO: Application startup complete.
[2025-06-23 19:14:21] INFO: Uvicorn running on http://0.0.0.0:36268 (Press CTRL+C to quit)
[2025-06-23 19:14:22] INFO: 127.0.0.1:54548 - "GET /v1/models HTTP/1.1" 200 OK
[2025-06-23 19:14:22] INFO: 127.0.0.1:54562 - "GET /get_model_info HTTP/1.1" 200 OK
[2025-06-23 19:14:22] Prefill batch. #new-seq: 1, #new-token: 6, #cached-token: 0, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-06-23 19:14:22] INFO: 127.0.0.1:54566 - "POST /encode HTTP/1.1" 200 OK
[2025-06-23 19:14:22] The server is fired up and ready to roll!
NOTE: Typically, the server runs in a separate terminal.
In this notebook, we run the server and notebook code together, so their outputs are combined.
To improve clarity, the server logs are displayed in the original black color, while the notebook outputs are highlighted in blue.
These notebooks run in a parallel CI environment, so the throughput shown is not representative of actual performance.
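As the note above says, the server usually runs in a separate terminal. In that setup the client has to wait for the server on its own. The startup log shows a GET /v1/models request answered with 200 OK, so polling that route is one plausible readiness check. Below is a minimal sketch using only the Python standard library. The name wait_until_ready, its probe parameter, and port 30000 in the usage comment are assumptions made for illustration and are not part of SGLang's API.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url, timeout_s=120.0, interval_s=1.0, probe=None):
    """Poll {base_url}/v1/models until it answers 200 or the timeout elapses.

    Hypothetical helper: `probe` is an injectable callable(url) -> bool,
    which makes the loop easy to exercise without a running server.
    """

    def http_probe(url):
        # Any connection error or non-200 status counts as "not ready yet".
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.status == 200
        except (urllib.error.URLError, OSError):
            return False

    probe = probe or http_probe
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(f"{base_url}/v1/models"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Usage (assumes a server was started separately and listens on port 30000):
# if wait_until_ready("http://localhost:30000"):
#     print("server is ready")
```

Inside this notebook, the wait_for_server helper imported from sglang.utils already plays this role, so the sketch is only relevant when the client runs outside the notebook.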
Using cURL
[2]:
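import subprocess, json

text = "Once upon a time"

# Build a cURL command that posts the text to the OpenAI-compatible embeddings endpoint.
curl_text = f"""curl -s http://localhost:{port}/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{{"model": "Alibaba-NLP/gte-Qwen2-1.5B-instruct", "input": "{text}"}}'"""

# Run the command and capture the raw JSON response as bytes.
result = subprocess.check_output(curl_text, shell=True)

print(result)

# The embedding vector is stored under data[0].embedding in the response.
text_embedding = json.loads(result)["data"][0]["embedding"]

print_highlight(f"Text embedding (first 10): {text_embedding[:10]}")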
[2025-06-23 19:14:27] Prefill batch. #new-seq: 1, #new-token: 4, #cached-token: 0, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-06-23 19:14:27] INFO: 127.0.0.1:34162 - "POST /v1/embeddings HTTP/1.1" 200 OK
b'{"data":[{"embedding":[-0.00023102760314941406,-0.04986572265625,-0.0032711029052734375,0.011077880859375,-0.0140533447265625,0.0159912109375,-0.01441192626953125,0.0059051513671875,-0.0228424072265625,0.0272979736328125,0.0014867782592773438,0.048370361328125,-0.001552581787109375,0.045257568359375,-0.01074981689453125,-0.00980377197265625,0.023040771484375,0.0272064208984375,0.00907135009765625,0.01212310791015625,-0.02362060546875,-0.0095672607421875,-0.03924560546875,-0.02520751953125,0.00032067298889160156,0.0022411346435546875,-0.010040283203125,-0.00238800048828125,0.025299072265625,0.00014603137969970703,-0.0235748291015625,-0.006145477294921875,-0.00872802734375,0.052978515625,0.004512786865234375,-0.0248565673828125,-0.00978851318359375,0.0307769775390625,-0.005023956298828125,0.0186004638671875,0.017486572265625,0.00415802001953125,-0.02264404296875,0.00405120849609375,0.03826904296875,0.0072479248046875,-0.0176849365234375,0.0282745361328125,-0.00027441978454589844,0.0208892822265625,-0.024505615234375,-0.01213836669921875,-0.00527191162109375,-0.0049591064453125,0.02935791015625,-0.0055389404296875,0.01332855224609375,-0.007720947265625,0.0029697418212890625,-0.014129638671875,0.0012693405151367188,0.0198211669921875,0.00025582313537597656,-0.0002351999282836914,-0.011383056640625,0.00484466552734375,-0.0178070068359375,-0.0142364501953125,0.00734710693359375,0.00424957275390625,0.0211944580078125,-0.0057220458984375,0.0166778564453125,0.01520538330078125,-0.01372528076171875,-0.0011196136474609375,-0.01551055908203125,-0.006420135498046875,-0.0017194747924804688,0.025238037109375,0.02044677734375,0.01084136962890625,0.0002701282501220703,-0.0458984375,-0.0012845993041992188,0.006633758544921875,0.0275421142578125,-0.01241302490234375,0.0063323974609375,0.0213470458984375,0.016876220703125,0.00010514259338378906,0.1868896484375,0.0260009765625,-0.03326416015625,0.014678955078125,-0.0222015380859375,-0.0224761962890625,-0.017364501953125,0.009239196777
34375,0.026214599609375,-0.002033233642578125,-0.00731658935546875,-0.0137176513671875,-0.0157470703125,-0.0269775390625,-0.369873046875,-0.0021038055419921875,0.00988006591796875,0.00286865234375,0.0282135009765625,-0.01220703125,-0.0262908935546875,-0.041717529296875,-0.01558685302734375,-0.0018243789672851562,0.0145263671875,-0.00493621826171875,-0.00986480712890625,0.007965087890625,0.005237579345703125,-0.0273284912109375,-0.006999969482421875,0.0005784034729003906,-0.03692626953125,0.0946044921875,0.0212249755859375,-0.0108489990234375,-0.04840087890625,-0.012908935546875,0.019927978515625,0.005405426025390625,-0.021392822265625,0.00308990478515625,0.0272369384765625,-0.039703369140625,-0.0017156600952148438,-0.011444091796875,0.005893707275390625,-0.0037593841552734375,0.0020751953125,-0.005161285400390625,-0.0105438232421875,0.0102691650390625,-0.028045654296875,-0.028289794921875,0.020721435546875,0.006595611572265625,0.01427459716796875,0.0018901824951171875,-0.0003972053527832031,-0.01491546630859375,0.009674072265625,0.004974365234375,-0.00925445556640625,0.05401611328125,0.0119476318359375,0.0108184814453125,0.0233306884765625,0.0264739990234375,-0.0059051513671875,0.00807952880859375,0.01110076904296875,-0.00616455078125,0.0057830810546875,-0.0233612060546875,0.0311737060546875,0.0007991790771484375,-0.03509521484375,0.0010995864868164062,0.0033740997314453125,-0.0167388916015625,0.020477294921875,0.00936126708984375,0.0010061264038085938,-0.00276947021484375,-0.00954437255859375,0.00525665283203125,-0.01418304443359375,-0.041961669921875,0.015716552734375,0.0111083984375,0.006374359130859375,0.01381683349609375,0.007228851318359375,-0.017913818359375,0.0018749237060546875,0.0027828216552734375,-0.09197998046875,-0.00637054443359375,-0.0021190643310546875,0.00600433349609375,0.0254058837890625,-0.00524139404296875,0.0215911865234375,-0.0001697540283203125,-0.01505279541015625,-0.00872039794921875,0.01180267333984375,-0.028228759765625,-0.00120353698730
46875,-0.004497528076171875,-0.00911712646484375,-0.006168365478515625,0.028656005859375,-0.004810333251953125,-0.0128936767578125,-0.0241546630859375,0.0085601806640625,-0.029052734375,-0.0035037994384765625,0.00725555419921875,-0.0172119140625,-0.01198577880859375,0.00394439697265625,-0.0112762451171875,-0.0023040771484375,-0.021392822265625,-0.00669097900390625,0.0223236083984375,-0.006923675537109375,-0.0380859375,-0.011016845703125,-0.0022106170654296875,0.00868988037109375,-0.0001493692398071289,-0.005008697509765625,0.01374053955078125,-0.00640106201171875,-0.02410888671875,-0.00424957275390625,-0.0027828216552734375,-0.0004031658172607422,-0.00466156005859375,-0.00035071372985839844,-0.00521087646484375,0.0238800048828125,0.0079193115234375,-0.01340484619140625,-0.019073486328125,0.0198516845703125,0.0096893310546875,-0.01495361328125,0.0074920654296875,-0.01751708984375,-0.0060272216796875,0.01317596435546875,0.01303863525390625,-0.0137786865234375,-0.007465362548828125,0.0157470703125,-0.0209503173828125,-0.00856781005859375,0.005962371826171875,0.0222625732421875,-0.0161895751953125,0.015838623046875,-0.011474609375,0.0037975311279296875,0.00980377197265625,-0.002750396728515625,0.028076171875,-0.006256103515625,-0.00214385986328125,-0.003631591796875,0.0189208984375,0.01132965087890625,-0.0165252685546875,-0.01456451416015625,-0.01287078857421875,-0.0224456787109375,-0.01200103759765625,-0.01099395751953125,0.0036106109619140625,0.01325225830078125,-0.00261688232421875,-0.024017333984375,0.016143798828125,0.0340576171875,-0.007289886474609375,0.01364898681640625,0.00673675537109375,0.023162841796875,-0.0177764892578125,-0.018890380859375,-0.00809478759765625,-0.04248046875,0.01381683349609375,0.00835418701171875,0.0241546630859375,-0.01053619384765625,-0.0030918121337890625,-0.01397705078125,0.00044417381286621094,0.0268402099609375,-0.01445770263671875,0.0101318359375,-0.01079559326171875,-0.02203369140625,-0.016326904296875,0.018768310546875,0.00314521
78955078125,-0.006710052490234375,-0.005107879638671875,-0.0142364501953125,-0.01074981689453125,-0.01038360595703125,0.035308837890625,-0.006717681884765625,-0.00405120849609375,0.007965087890625,0.00945281982421875,-0.007843017578125,0.0116119384765625,-0.0193634033203125,-0.0013799667358398438,0.01392364501953125,-0.004878997802734375,0.0011949539184570312,-0.003711700439453125,0.01214599609375,0.0091705322265625,0.038818359375,0.01145172119140625,0.059112548828125,0.0178070068359375,0.024322509765625,0.0013246536254882812,0.0136260986328125,-0.0294342041015625,-0.0281524658203125,-0.089599609375,0.018157958984375,0.0261688232421875,0.0115814208984375,0.009857177734375,0.0047607421875,0.049957275390625,0.0247802734375,-0.0205535888671875,0.025848388671875,0.004642486572265625,-0.02471923828125,0.006534576416015625,-0.007373809814453125,-0.02398681640625,-0.0005130767822265625,0.0172271728515625,0.0298309326171875,0.01953125,-0.0092010498046875,-0.0026607513427734375,-0.0011911392211914062,0.0020847320556640625,-0.018463134765625,0.029937744140625,-0.0090179443359375,-0.0008406639099121094,-0.00469970703125,0.0009584426879882812,-0.00493621826171875,0.0247650146484375,0.00855255126953125,-0.01513671875,-0.01113128662109375,0.0130462646484375,0.0179290771484375,-0.03271484375,0.0238494873046875,-0.0010318756103515625,-0.0175323486328125,-0.0007572174072265625,0.01898193359375,-0.0158843994140625,0.000728607177734375,-0.004680633544921875,-0.00405120849609375,0.00872039794921875,-0.01114654541015625,-0.009002685546875,-0.0022487640380859375,-0.00717926025390625,0.00594329833984375,0.010040283203125,0.02044677734375,0.019989013671875,0.00836944580078125,-0.022918701171875,-0.06927490234375,-0.0120849609375,0.038116455078125,-0.0107269287109375,0.0003495216369628906,-0.0110626220703125,0.0004513263702392578,-0.0148162841796875,0.003582000732421875,-0.029937744140625,0.0252838134765625,0.0204315185546875,-0.02032470703125,0.0013914108276367188,-0.0338134765625,-0.00650
4058837890625,-0.0025234222412109375,0.003139495849609375,0.0025882720947265625,0.00605010986328125,0.0216522216796875,-0.00824737548828125,0.007022857666015625,0.0775146484375,0.00201416015625,-0.0030002593994140625,-0.001422882080078125,-0.0121612548828125,-0.0242462158203125,0.01531982421875,0.0289306640625,0.0016031265258789062,-0.01123809814453125,0.0016222000122070312,0.016265869140625,0.004283905029296875,-0.01491546630859375,0.0084381103515625,0.0014743804931640625,0.0001951456069946289,0.006664276123046875,-0.02386474609375,-0.01305389404296875,0.0118560791015625,0.00882720947265625,-0.003086090087890625,-0.0266265869140625,-0.0196533203125,0.0222930908203125,0.0112457275390625,-0.0160675048828125,-0.039276123046875,0.033447265625,0.0193939208984375,-0.0274200439453125,-0.0065155029296875,0.0284271240234375,-0.01312255859375,-0.0176239013671875,-0.05792236328125,0.006175994873046875,-0.0248870849609375,0.0194854736328125,-0.0074005126953125,-0.0145111083984375,-0.0019989013671875,-0.0078277587890625,-0.0231170654296875,0.006374359130859375,-0.0230712890625,-0.0011529922485351562,0.00611114501953125,0.007106781005859375,0.00579833984375,-0.00897979736328125,0.01934814453125,0.049652099609375,0.0154266357421875,-0.0008764266967773438,-0.0284423828125,0.01247406005859375,-0.0189666748046875,0.00417327880859375,0.0108184814453125,-0.025543212890625,-0.2486572265625,-0.0028438568115234375,-0.0216522216796875,0.01415252685546875,0.017242431640625,0.0305023193359375,-0.03466796875,0.01837158203125,-0.01910400390625,-0.016693115234375,-0.020843505859375,-0.0082855224609375,0.037078857421875,0.01287841796875,-0.0001322031021118164,0.0154876708984375,0.03155517578125,0.01540374755859375,-0.000988006591796875,-0.005096435546875,0.033935546875,-0.004276275634765625,0.01058197021484375,0.0003693103790283203,-0.01708984375,0.01064300537109375,-0.00902557373046875,0.01010894775390625,0.01018524169921875,0.00739288330078125,0.0173797607421875,-0.02056884765625,0.0034751892
08984375,0.0006489753723144531,0.032318115234375,0.032501220703125,-0.00039005279541015625,-0.01158905029296875,0.00704193115234375,-0.0135040283203125,0.012603759765625,0.01074981689453125,0.042938232421875,0.0202178955078125,0.010223388671875,0.004161834716796875,-0.006420135498046875,-0.0037288665771484375,-0.0033359527587890625,-0.0089263916015625,0.021484375,0.01137542724609375,0.007289886474609375,-0.038604736328125,-0.018768310546875,-0.0166778564453125,-0.010894775390625,0.00905609130859375,0.038787841796875,0.02264404296875,-0.0077362060546875,-0.007755279541015625,0.007534027099609375,-0.035247802734375,-0.010162353515625,-0.01180267333984375,-0.0203399658203125,0.0014295578002929688,0.0077362060546875,0.01178741455078125,-0.0018157958984375,-0.0406494140625,0.004207611083984375,-0.00400543212890625,-0.0037841796875,-0.0189208984375,-0.00794219970703125,0.025970458984375,0.0345458984375,0.0081787109375,-0.017364501953125,0.0014467239379882812,0.0017423629760742188,-0.00811767578125,-0.00833892822265625,-0.01263427734375,-0.09521484375,0.04534912109375,0.0020427703857421875,0.0224761962890625,-0.018585205078125,-0.0101318359375,0.0183258056640625,-0.003917694091796875,-0.0107421875,-0.0004620552062988281,-0.0143280029296875,-0.0061492919921875,-0.01544189453125,0.0140533447265625,-0.0103302001953125,-0.01335906982421875,-0.01471710205078125,-0.00782012939453125,0.030670166015625,0.030975341796875,0.00006532669067382812,-0.005725860595703125,-0.0088958740234375,0.0030803680419921875,0.0115203857421875,0.00457000732421875,0.016448974609375,-0.00878143310546875,0.00847625732421875,-0.00328826904296875,0.00835418701171875,-0.01184844970703125,0.011627197265625,-0.01384735107421875,-0.010711669921875,0.0167236328125,-0.130859375,-0.00995635986328125,-0.01207733154296875,0.017730712890625,0.0138092041015625,-0.0101165771484375,-0.00844573974609375,-0.01971435546875,-0.020233154296875,0.006465911865234375,-0.0255889892578125,-0.01788330078125,0.00270843505859375,-
0.0081634521484375,0.007904052734375,0.00217437744140625,0.006542205810546875,-0.0269927978515625,-0.021453857421875,0.0019025802612304688,0.0256805419921875,-0.0151519775390625,-0.00382232666015625,-0.00569915771484375,-0.00005257129669189453,-0.0143280029296875,-0.005008697509765625,-0.011016845703125,0.016845703125,0.007518768310546875,0.01520538330078125,0.01224517822265625,-0.00347137451171875,-0.01349639892578125,0.0157928466796875,0.015869140625,0.0216827392578125,0.020050048828125,-0.0012607574462890625,-0.0147857666015625,0.002193450927734375,-0.0343017578125,-0.0105743408203125,0.01010894775390625,-0.01018524169921875,0.030975341796875,0.0157318115234375,-0.030029296875,-0.0004382133483886719,-0.027923583984375,-0.0008397102355957031,0.01491546630859375,-0.00440216064453125,0.0196075439453125,-0.00336456298828125,0.0041961669921875,0.00896453857421875,-0.01483154296875,-0.0283050537109375,-0.0286865234375,0.0136260986328125,0.005855560302734375,0.00803375244140625,0.0066986083984375,-0.0267181396484375,0.0106658935546875,-0.00820159912109375,0.0027980804443359375,-0.045166015625,0.01409912109375,-0.0185394287109375,0.0149993896484375,-0.01386260986328125,-0.00826263427734375,-0.0210418701171875,-0.003833770751953125,-0.019134521484375,-0.0255279541015625,0.006229400634765625,0.00960540771484375,0.016357421875,-0.01064300537109375,-0.00749969482421875,-0.026031494140625,-0.013763427734375,0.0040435791015625,-0.0078582763671875,-0.00382232666015625,0.008697509765625,-0.00911712646484375,0.0100250244140625,0.01509857177734375,0.0015573501586914062,-0.00021159648895263672,-0.00002962350845336914,-0.00981903076171875,-0.01763916015625,-0.016937255859375,0.00322723388671875,-0.0031528472900390625,0.0174102783203125,-0.0141448974609375,-0.00119781494140625,-0.0157470703125,0.0005621910095214844,-0.01100921630859375,-0.0101165771484375,-0.026092529296875,0.0208282470703125,-0.0142059326171875,0.009063720703125,0.0022735595703125,-0.003154754638671875,-0.0156860351
5625,-0.01320648193359375,0.05987548828125,-0.0631103515625,0.07958984375,-0.0032444000244140625,0.0022068023681640625,0.00960540771484375,-0.00881195068359375,0.00307464599609375,-0.00328826904296875,0.004024505615234375,-0.0175018310546875,-0.034637451171875,-0.01873779296875,0.021392822265625,0.0140838623046875,0.006580352783203125,0.012542724609375,0.00496673583984375,0.0104522705078125,0.0029582977294921875,-0.01535797119140625,-0.007778167724609375,-0.00893402099609375,0.0114898681640625,0.0233306884765625,-0.0207061767578125,0.01038360595703125,0.073486328125,-0.006755828857421875,-0.0019130706787109375,0.01336669921875,-0.015899658203125,0.044219970703125,-0.00968170166015625,0.07232666015625,0.0093536376953125,-0.0224761962890625,-0.0154876708984375,0.017333984375,0.032806396484375,-0.005519866943359375,0.004985809326171875,0.0018520355224609375,-0.030242919921875,-0.002044677734375,-0.0212860107421875,0.0144805908203125,-0.01177978515625,-0.022857666015625,-0.01123809814453125,0.005397796630859375,-0.0018901824951171875,-0.006137847900390625,0.00763702392578125,-0.0208587646484375,0.00021409988403320312,-0.008819580078125,0.01129913330078125,0.0009546279907226562,0.016510009765625,-0.004695892333984375,0.0016231536865234375,-0.032440185546875,0.006435394287109375,-0.047698974609375,-0.019073486328125,-0.01080322265625,-0.0157318115234375,-0.01025390625,-0.0372314453125,-0.029296875,0.01226806640625,-0.0030612945556640625,0.029052734375,-0.00983428955078125,0.02117919921875,-0.007518768310546875,0.0251617431640625,-0.005794525146484375,0.0110321044921875,0.0048370361328125,0.0024738311767578125,-0.002044677734375,0.01904296875,0.0011234283447265625,0.005859375,0.0166168212890625,-0.0217132568359375,-0.06707763671875,0.004680633544921875,-0.0113372802734375,-0.00799560546875,0.1741943359375,-0.00583648681640625,-0.012420654296875,0.0298004150390625,-0.0012760162353515625,-0.03143310546875,0.01531982421875,0.00775146484375,-0.03448486328125,0.00604248046875,0
.0139312744140625,-0.00919342041015625,-0.016998291015625,0.03411865234375,0.06072998046875,-0.00179290771484375,-0.00916290283203125,0.037841796875,0.01389312744140625,0.0025730133056640625,-0.003986358642578125,0.0288848876953125,0.0154876708984375,-0.0042877197265625,-0.0012311935424804688,0.00948333740234375,0.0180816650390625,-0.016021728515625,0.0186004638671875,0.004665374755859375,-0.0262451171875,-0.01068878173828125,0.0026416778564453125,0.01355743408203125,-0.005222320556640625,0.01523590087890625,-0.0247955322265625,-0.00821685791015625,0.0010423660278320312,-0.0146942138671875,0.00860595703125,0.01010894775390625,0.062744140625,0.016876220703125,0.0093536376953125,0.0167694091796875,-0.01708984375,0.010345458984375,-0.008575439453125,-0.00531005859375,-0.0024623870849609375,0.014373779296875,0.0082550048828125,-0.00487518310546875,-0.01171875,-0.01456451416015625,0.0369873046875,-0.007110595703125,0.012359619140625,-0.006275177001953125,-0.00970458984375,0.006626129150390625,-0.0256500244140625,0.001338958740234375,-0.0111846923828125,0.00896453857421875,0.007709503173828125,0.0140228271484375,-0.01473236083984375,0.0107421875,0.03375244140625,0.0045623779296875,0.027618408203125,-0.0175628662109375,0.003963470458984375,-0.002826690673828125,-0.0095367431640625,-0.0235137939453125,-0.0115966796875,-0.01119232177734375,0.003993988037109375,-0.00884246826171875,-0.011505126953125,0.01490020751953125,0.08087158203125,-0.053497314453125,0.0026378631591796875,0.0170745849609375,-0.007305145263671875,-0.005039215087890625,-0.03314208984375,0.0025844573974609375,0.0082855224609375,0.019012451171875,-0.017730712890625,-0.0229949951171875,-0.01219940185546875,0.0037212371826171875,0.002349853515625,0.007457733154296875,-0.005634307861328125,-0.0034122467041015625,0.02728271484375,0.00031065940856933594,0.021514892578125,-0.02313232421875,0.0008296966552734375,-0.0133514404296875,0.0160064697265625,0.0018186569213867188,-0.01114654541015625,0.00400543212890625,-0
.0016326904296875,-0.014617919921875,0.007137298583984375,0.0050811767578125,0.0073089599609375,0.0169677734375,0.0291900634765625,-0.0310211181640625,-0.03143310546875,0.012603759765625,0.010101318359375,-0.0219573974609375,0.01360321044921875,0.0168304443359375,-0.07977294921875,-0.03668212890625,-0.020111083984375,-0.0107269287109375,-0.0184326171875,-0.00342559814453125,-0.0036945343017578125,-0.00731658935546875,0.0055389404296875,-0.02142333984375,-0.001720428466796875,-0.0192413330078125,0.0943603515625,-0.0054168701171875,0.00843048095703125,-0.264404296875,-0.0184173583984375,-0.00803375244140625,-0.0074005126953125,-0.02520751953125,0.0273284912109375,0.007793426513671875,-0.0037326812744140625,0.0160980224609375,0.0311431884765625,-0.01248931884765625,-0.01308441162109375,0.009246826171875,0.068603515625,0.015869140625,0.0404052734375,-0.0243988037109375,0.0289154052734375,0.01427459716796875,-0.00783538818359375,-0.001354217529296875,-0.035888671875,0.0015106201171875,0.0005698204040527344,-0.0015630722045898438,0.01151275634765625,0.0146636962890625,-0.0091094970703125,-0.019683837890625,0.0232391357421875,-0.0011415481567382812,-0.0003418922424316406,0.005847930908203125,-0.00783538818359375,-0.009613037109375,0.0132904052734375,0.032012939453125,-0.0145263671875,0.01190948486328125,-0.004970550537109375,-0.0252838134765625,0.0008139610290527344,0.01488494873046875,0.0159759521484375,0.0102081298828125,-0.0081787109375,-0.0018749237060546875,-0.005767822265625,0.008514404296875,0.0108795166015625,-0.0218353271484375,-0.004512786865234375,-0.0038089752197265625,-0.00276947021484375,0.01360321044921875,-0.003948211669921875,-0.0021457672119140625,-0.01064300537109375,-0.031646728515625,-0.006450653076171875,-0.00580596923828125,-0.0018367767333984375,-0.0281524658203125,-0.002849578857421875,0.009490966796875,0.0030574798583984375,-0.01264190673828125,-0.0177001953125,-0.010711669921875,-0.0406494140625,-0.003753662109375,-0.0075531005859375,0.0155715942
3828125,-0.0055999755859375,-0.01361083984375,0.0171966552734375,0.007373809814453125,-0.029632568359375,0.0014982223510742188,-0.02166748046875,-0.0291595458984375,-0.00521087646484375,-0.00939178466796875,0.0040435791015625,-0.030242919921875,0.005840301513671875,0.01471710205078125,0.0054473876953125,-0.0114898681640625,-0.0174560546875,-0.0009918212890625,0.01454925537109375,0.01412200927734375,-0.0161895751953125,-0.0013570785522460938,-0.0007658004760742188,0.00811767578125,0.014129638671875,-0.0064544677734375,0.01256561279296875,0.01558685302734375,-0.0039825439453125,0.001056671142578125,0.036376953125,0.018951416015625,0.01445770263671875,0.0022106170654296875,-0.036468505859375,-0.015960693359375,-0.0038051605224609375,-0.00328826904296875,-0.00919342041015625,-0.0125274658203125,-0.00838470458984375,-0.09625244140625,0.0611572265625,0.00679779052734375,0.022216796875,0.00952911376953125,-0.019622802734375,-0.0036029815673828125,-0.0210418701171875,0.057037353515625,-0.002460479736328125,0.01483154296875,-0.05078125,-0.0010251998901367188,-0.006542205810546875,-0.0186004638671875,0.01141357421875,0.001983642578125,-0.0023708343505859375,0.00809478759765625,-0.0070343017578125,0.00958251953125,0.0007519721984863281,0.01812744140625,-0.01300811767578125,0.00264739990234375,0.003070831298828125,0.00879669189453125,-0.030975341796875,0.02166748046875,0.004413604736328125,-0.02423095703125,-0.0133056640625,0.03619384765625,-0.0625,0.0294036865234375,0.01457977294921875,0.00698089599609375,-0.0151519775390625,-0.00540924072265625,0.006259918212890625,-0.0072479248046875,-0.01280975341796875,-0.0017652511596679688,0.01171875,0.002513885498046875,0.01097869873046875,-0.0247650146484375,0.005588531494140625,-0.0128326416015625,-0.00829315185546875,-0.016632080078125,-0.010101318359375,0.0249176025390625,0.01557159423828125,-0.01442718505859375,-0.1650390625,0.02264404296875,0.034454345703125,-0.025970458984375,-0.0060577392578125,0.0167236328125,0.01123046875,-0.0
06450653076171875,-0.008453369140625,0.00759124755859375,0.01488494873046875,-0.0159759521484375,-0.0024089813232421875,0.0280914306640625,0.0010623931884765625,0.048248291015625,-0.011444091796875,-0.030364990234375,-0.034759521484375,0.038848876953125,0.003971099853515625,-0.0019464492797851562,-0.01506805419921875,-0.0055389404296875,0.01262664794921875,0.0187530517578125,0.0301361083984375,-0.0168609619140625,0.0040740966796875,0.0266876220703125,-0.0113677978515625,-0.005260467529296875,0.01035308837890625,-0.001186370849609375,-0.0633544921875,0.0121917724609375,-0.01078033447265625,-0.014312744140625,-0.00982666015625,0.0219879150390625,0.005878448486328125,0.010467529296875,-0.0157928466796875,-0.006916046142578125,-0.008544921875,-0.048492431640625,0.01114654541015625,0.0081329345703125,0.0083465576171875,0.1513671875,0.0082855224609375,0.021881103515625,0.00179290771484375,-0.00545501708984375,-0.004100799560546875,-0.01178741455078125,0.022735595703125,0.0045928955078125,0.029052734375,-0.0301055908203125,-0.002918243408203125,-0.0002028942108154297,0.0255584716796875,0.021484375,0.01328277587890625,0.0065765380859375,0.00424957275390625,-0.0143585205078125,0.0185546875,-0.01312255859375,-0.00008237361907958984,0.00905609130859375,-0.01495361328125,-0.006160736083984375,0.0173797607421875,-0.0196990966796875,-0.004688262939453125,-0.00717926025390625,-0.02618408203125,0.00463104248046875,0.005992889404296875,0.006969451904296875,-0.017486572265625,-0.01253509521484375,0.032135009765625,-0.01678466796875,-0.016815185546875,-0.0102386474609375,0.0168304443359375,-0.032806396484375,0.003520965576171875,0.0091705322265625,-0.012298583984375,-0.0009250640869140625,-0.0168304443359375,0.03375244140625,-0.0029964447021484375,0.040618896484375,-0.0125579833984375,0.0062255859375,-0.03155517578125,-0.003971099853515625,-0.0362548828125,-0.06927490234375,0.018402099609375,0.0028591156005859375,-0.01276397705078125,-0.015838623046875,0.041412353515625,0.002534866333
0078125,-0.0333251953125,0.0013742446899414062,-0.00029540061950683594,-0.00513458251953125,-0.01788330078125,0.00040984153747558594,-0.02081298828125,0.0074310302734375,-0.0006952285766601562,0.0003848075866699219,0.040802001953125,-0.0005335807800292969,0.00914764404296875,-0.00676727294921875,0.0017271041870117188,-0.0130462646484375,-0.00298309326171875,-0.0208587646484375,-0.01114654541015625,-0.00913238525390625,-0.00946044921875,-0.006816864013671875,-0.029571533203125,-0.006687164306640625,-0.006587982177734375,0.0086669921875,0.057373046875,0.01175689697265625,-0.036529541015625,0.03570556640625,0.0031948089599609375,-0.01346588134765625,-0.0020389556884765625,-0.018707275390625,0.0084228515625,-0.0139312744140625,-0.002338409423828125,0.0009918212890625,0.02642822265625,-0.010772705078125,-0.012786865234375,0.00013875961303710938,0.015777587890625,-0.00018644332885742188,0.01367950439453125,0.00301361083984375,-0.01519012451171875,0.01824951171875,0.0000998377799987793,0.039459228515625,-0.018280029296875,0.0023021697998046875,0.0156097412109375,0.0148468017578125,-0.01153564453125,-0.0301666259765625,-0.040252685546875,-0.01483154296875,0.0016994476318359375,0.0007262229919433594,0.0215301513671875,0.0113525390625,0.0206298828125,0.0295562744140625,0.006908416748046875,-0.0196990966796875,0.02166748046875,-0.0009050369262695312,0.001377105712890625,-0.01065826416015625,0.032012939453125,-0.0014181137084960938,0.0146484375,-0.0086822509765625,-0.004123687744140625,0.005970001220703125,0.0009703636169433594,-0.0166015625,-0.0172882080078125,0.0010013580322265625,-0.024749755859375,0.01308441162109375,-0.0054779052734375,-0.002838134765625,-0.004390716552734375,0.0029296875,0.01456451416015625,0.0013513565063476562,0.00873565673828125,0.0034637451171875,0.02874755859375,-0.0077972412109375,0.0102691650390625,0.01541900634765625,-0.0000928044319152832,-0.0017824172973632812,-0.01099395751953125,-0.01019287109375,-0.01285552978515625,0.00469207763671875,0.0144
95849609375,0.01136016845703125,0.0141448974609375,-0.0018520355224609375,-0.022491455078125,-0.005657196044921875,0.00620269775390625,-0.01445770263671875,-0.01068115234375,0.0208892822265625,0.01031494140625,0.0200653076171875,-0.01180267333984375,-0.0178985595703125,0.00159454345703125,0.008544921875,0.03289794921875,0.00017774105072021484,0.02716064453125,-0.01218414306640625,-0.02520751953125,-0.0047149658203125,0.01052093505859375,-0.01308441162109375,-0.0113983154296875,-0.0030498504638671875,-0.0015840530395507812,0.011962890625,0.0208740234375,0.01195526123046875,0.0643310546875,-0.01015472412109375,-0.00350189208984375,0.00740814208984375,-0.0152740478515625,-0.0294952392578125,-0.0006608963012695312,-0.0218658447265625,-0.005035400390625,0.0012979507446289062,-0.023956298828125,-0.01470184326171875,-0.00748443603515625,-0.00994110107421875,-0.01454925537109375,0.0467529296875,-0.0172882080078125,0.0015039443969726562,0.0158538818359375,-0.007965087890625,-0.0200042724609375,-0.00658416748046875,-0.0093994140625,0.0023746490478515625,-0.0188446044921875,0.0285491943359375,-0.019134521484375,0.00930023193359375,-0.0019359588623046875,0.01143646240234375,-0.03448486328125,-0.0033931732177734375,0.0021953582763671875,-0.005527496337890625,0.0006771087646484375,-0.020721435546875,0.03607177734375,-0.01074981689453125,-0.0002053976058959961,-0.00975799560546875,0.0101318359375,0.006504058837890625,-0.01210784912109375,-0.0102691650390625,-0.003978729248046875,0.037506103515625,-0.02032470703125,0.01026153564453125,-0.032470703125,0.017578125,0.021209716796875,0.01971435546875,0.0300445556640625,0.01035308837890625,-0.0181732177734375,-0.003307342529296875,-0.0128936767578125,0.0062408447265625,0.01131439208984375,0.0036411285400390625,-0.003215789794921875,0.00733184814453125,-0.01491546630859375,0.17724609375,-0.0231475830078125,0.020721435546875,-0.0123291015625,-0.0037841796875,0.00936126708984375,-0.002288818359375,-0.019775390625,0.001811981201171875,-0.00
7678985595703125,0.014678955078125,-0.0105438232421875,0.01525115966796875,-0.0150146484375,-0.012451171875,0.006618499755859375,0.0262603759765625,0.01152801513671875,-0.015472412109375,-0.0207977294921875,-0.004184722900390625,-0.01219940185546875,-0.0089111328125,0.016632080078125,0.01374053955078125,-0.002765655517578125,-0.0189361572265625,0.0291595458984375,0.007007598876953125,-0.0004413127899169922,-0.01189422607421875,-0.0159149169921875,-0.0086212158203125,-0.0196075439453125,0.0037841796875,-0.020172119140625,0.00885772705078125,-0.01302337646484375,-0.0311431884765625,-0.016265869140625,0.01373291015625,0.01259613037109375,0.01325225830078125,0.0182342529296875,-0.0073089599609375,0.006134033203125,-0.01523590087890625,0.01326751708984375,-0.0009407997131347656,0.0013246536254882812,-0.028472900390625,0.00551605224609375,0.0010852813720703125,0.018218994140625,-0.004726409912109375,0.0007076263427734375,0.01013946533203125,-0.005626678466796875,0.0178375244140625,0.0180816650390625,0.0217742919921875,-0.0214385986328125,-0.0279693603515625,0.004016876220703125,0.01371002197265625,0.005886077880859375,-0.006076812744140625,-0.01953125,-0.01348114013671875,0.0156707763671875,0.03424072265625,-0.0091705322265625,0.01535797119140625,0.00560760498046875,-0.002124786376953125,0.003574371337890625,0.0120391845703125,-0.002864837646484375,0.00566864013671875,-0.0065765380859375,-0.0289306640625,-0.01522064208984375,0.0146484375,-0.036346435546875,0.0033473968505859375,-0.004150390625,0.022216796875,0.0232696533203125,0.0011587142944335938,-0.0025424957275390625,-0.021484375,-0.017852783203125,0.0123748779296875,-0.003963470458984375,0.01031494140625,-0.01226806640625,-0.0083770751953125,-0.0045318603515625,0.00004738569259643555,-0.003742218017578125,-0.0034503936767578125,-0.01557159423828125,-0.0009288787841796875,0.01488494873046875,-0.0130157470703125,-0.00121307373046875,0.00994110107421875,-0.0050048828125,-0.0159149169921875,-0.013946533203125,-0.00860595
703125,0.0143280029296875,0.00543975830078125,-0.003353118896484375,0.0010385513305664062,0.016845703125,-0.0057830810546875,0.056488037109375,0.0085296630859375,0.0087432861328125,0.01617431640625],"index":0,"object":"embedding"}],"model":"Alibaba-NLP/gte-Qwen2-1.5B-instruct","object":"list","usage":{"prompt_tokens":4,"total_tokens":4,"completion_tokens":0,"prompt_tokens_details":null}}'
Text embedding (first 10): [-0.00023102760314941406, -0.04986572265625, -0.0032711029052734375, 0.011077880859375, -0.0140533447265625, 0.0159912109375, -0.01441192626953125, 0.0059051513671875, -0.0228424072265625, 0.0272979736328125]
Using Python Requests#
[3]:
import requests
text = "Once upon a time"
response = requests.post(
    f"http://localhost:{port}/v1/embeddings",
    json={"model": "Alibaba-NLP/gte-Qwen2-1.5B-instruct", "input": text},
)
text_embedding = response.json()["data"][0]["embedding"]
print_highlight(f"Text embedding (first 10): {text_embedding[:10]}")
[2025-06-23 19:14:27] Prefill batch. #new-seq: 1, #new-token: 1, #cached-token: 3, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-06-23 19:14:27] INFO: 127.0.0.1:34172 - "POST /v1/embeddings HTTP/1.1" 200 OK
Text embedding (first 10): [-0.00023102760314941406, -0.04986572265625, -0.0032711029052734375, 0.011077880859375, -0.0140533447265625, 0.0159912109375, -0.01441192626953125, 0.0059051513671875, -0.0228424072265625, 0.0272979736328125]
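The response body follows the OpenAI embeddings schema visible in the raw curl output earlier: a "data" list whose items each carry an "embedding" and an "index". The following is a minimal sketch of a parser for that payload. The helper name extract_embeddings is hypothetical (it is not part of SGLang or requests), the sample values are made up, and the sketch assumes that a list-valued input yields one "data" item per input string, as in the OpenAI API.

```python
from typing import Any


def extract_embeddings(payload: dict[str, Any]) -> list[list[float]]:
    """Return the embedding vectors from an OpenAI-style response, ordered by index.

    Hypothetical helper for illustration; not part of SGLang or requests.
    """
    if "data" not in payload:
        # Error responses carry no "data" list, so fail with the payload for context.
        raise ValueError(f"Unexpected response: {payload}")
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Made-up payload with the same shape as the server response (values shortened).
sample = {
    "data": [
        {"embedding": [0.5, -0.25], "index": 1, "object": "embedding"},
        {"embedding": [0.1, 0.2], "index": 0, "object": "embedding"},
    ],
    "object": "list",
}
vectors = extract_embeddings(sample)
print(vectors)
```

Applied to the cell above, extract_embeddings(response.json())[0] should give the same vector as text_embedding.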
Using OpenAI Python Client#
[4]:
import openai
client = openai.Client(base_url=f"http://127.0.0.1:{port}/v1", api_key="None")
# Text embedding example
response = client.embeddings.create(
    model="Alibaba-NLP/gte-Qwen2-1.5B-instruct",
    input=text,
)
embedding = response.data[0].embedding[:10]
print_highlight(f"Text embedding (first 10): {embedding}")
[2025-06-23 19:14:27] Prefill batch. #new-seq: 1, #new-token: 1, #cached-token: 3, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-06-23 19:14:27] INFO: 127.0.0.1:34178 - "POST /v1/embeddings HTTP/1.1" 200 OK
Text embedding (first 10): [-0.00023102760314941406, -0.04986572265625, -0.0032711029052734375, 0.011077880859375, -0.0140533447265625, 0.0159912109375, -0.01441192626953125, 0.0059051513671875, -0.0228424072265625, 0.0272979736328125]
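Once embeddings are retrieved, a common next step is to compare two of them by cosine similarity. The following is a minimal pure-Python sketch. The function name cosine_similarity is a hypothetical helper and the vectors are toy values, not model output.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration, not real model output.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0. The full vector from response.data[0].embedding can be passed in the same way.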
Using Input IDs#
SGLang also supports input_ids as input to get the embedding.
[5]:
import json
import os
import subprocess
from transformers import AutoTokenizer
os.environ["TOKENIZERS_PARALLELISM"] = "false"
tokenizer = AutoTokenizer.from_pretrained("Alibaba-NLP/gte-Qwen2-1.5B-instruct")
# Tokenize the same text and send the token IDs instead of the raw string.
input_ids = tokenizer.encode(text)
curl_ids = f"""curl -s http://localhost:{port}/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{{"model": "Alibaba-NLP/gte-Qwen2-1.5B-instruct", "input": {json.dumps(input_ids)}}}'"""
response_json = json.loads(subprocess.check_output(curl_ids, shell=True))
input_ids_embedding = response_json["data"][0]["embedding"]
print_highlight(f"Input IDs embedding (first 10): {input_ids_embedding[:10]}")
[2025-06-23 19:14:31] Prefill batch. #new-seq: 1, #new-token: 1, #cached-token: 3, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-06-23 19:14:31] INFO: 127.0.0.1:34184 - "POST /v1/embeddings HTTP/1.1" 200 OK
Input IDs embedding (first 10): [-0.00023102760314941406, -0.04986572265625, -0.0032711029052734375, 0.011077880859375, -0.0140533447265625, 0.0159912109375, -0.01441192626953125, 0.0059051513671875, -0.0228424072265625, 0.0272979736328125]
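The leading values printed for the text input and for the input IDs are identical, which is expected because the token IDs encode the same text. The following is a small sketch for checking this programmatically. The helper name embeddings_match and the tolerance are assumptions chosen for illustration.

```python
import math


def embeddings_match(a: list[float], b: list[float], abs_tol: float = 1e-6) -> bool:
    """True if both vectors have the same length and agree element-wise within abs_tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol) for x, y in zip(a, b)
    )


# First three values printed by the cells above, used as stand-ins for the full vectors.
from_text = [-0.00023102760314941406, -0.04986572265625, -0.0032711029052734375]
from_ids = [-0.00023102760314941406, -0.04986572265625, -0.0032711029052734375]
print(embeddings_match(from_text, from_ids))
```

In the notebook, embeddings_match(text_embedding, input_ids_embedding) would run the same check on the full vectors.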
[6]:
terminate_process(embedding_process)
[2025-06-23 19:14:31] Child process unexpectedly failed with exitcode=9. pid=282815
[2025-06-23 19:14:31] Child process unexpectedly failed with exitcode=9. pid=282748
Multi-Modal Embedding Model#
Please refer to the Multi-Modal Embedding Model page.