Structured Outputs For Reasoning Models#

When working with reasoning models that use special tokens like <think>...</think> to denote reasoning sections, you might want to allow free-form text within these sections while still enforcing grammar constraints on the rest of the output.

SGLang provides a feature to disable grammar restrictions within reasoning sections. This is particularly useful for models that need to perform complex reasoning steps before providing a structured output.

To enable this feature, pass the --reasoning-parser flag when launching the server. The parser you select determines the think_end_token, such as </think>, which marks the end of the reasoning section and the point after which grammar constraints apply.
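The boundary described above can be pictured with a small, self-contained sketch. This is only a conceptual illustration and not SGLang's implementation: `split_reasoning` is a hypothetical helper that splits raw model text at the end token, treating everything before it as free-form reasoning and everything after it as the part a grammar would constrain.

```python
def split_reasoning(text, think_end_token="</think>"):
    """Split raw model output into (reasoning, constrained_answer).

    Hypothetical helper for illustration; not part of SGLang.
    """
    reasoning, found, answer = text.partition(think_end_token)
    # Drop an optional opening tag so only the reasoning text remains.
    reasoning = reasoning.replace("<think>", "", 1).strip()
    if not found:
        # No end token yet: everything so far is still free-form reasoning.
        return reasoning, ""
    return reasoning, answer.strip()


raw = '<think>Paris is the capital of France.</think>{"name": "Paris"}'
reasoning, answer = split_reasoning(raw)
print(reasoning)
print(answer)
```

Only the second element of the returned pair corresponds to text that the grammar backend would restrict.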

Supported Models#

Currently, SGLang supports the following reasoning models:

  • DeepSeek R1 series: The reasoning content is wrapped with <think> and </think> tags.

  • QwQ: The reasoning content is wrapped with <think> and </think> tags.

Usage#

OpenAI Compatible API#

Specify the --grammar-backend and --reasoning-parser options when launching the server.
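As a rough sketch, a launch command that sets both options explicitly could be assembled as shown here. The model path and parser name are the ones used on this page; passing `xgrammar` to --grammar-backend is an assumption based on the `grammar_backend='xgrammar'` value that appears in the server log, and the variable names are illustrative only.

```python
import shlex

# Arguments for the server launch. The grammar backend value is an assumption
# taken from the default reported in the server log on this page.
launch_args = [
    "python", "-m", "sglang.launch_server",
    "--model-path", "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    "--host", "0.0.0.0",
    "--reasoning-parser", "deepseek-r1",
    "--grammar-backend", "xgrammar",
]

# Join into a single shell-safe string, the form launch_server_cmd receives.
launch_command = shlex.join(launch_args)
print(launch_command)
```

Building the command from a list keeps each flag next to its value, which makes it harder to drop one of the two options by accident.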

[1]:
import openai
import os
from sglang.test.test_utils import is_in_ci

if is_in_ci():
    from patch import launch_server_cmd
else:
    from sglang.utils import launch_server_cmd

from sglang.utils import wait_for_server, print_highlight, terminate_process

os.environ["TOKENIZERS_PARALLELISM"] = "false"


server_process, port = launch_server_cmd(
    "python -m sglang.launch_server --model-path deepseek-ai/DeepSeek-R1-Distill-Qwen-7B --host 0.0.0.0 --reasoning-parser deepseek-r1"
)

wait_for_server(f"http://localhost:{port}")
client = openai.Client(base_url=f"http://127.0.0.1:{port}/v1", api_key="None")
[2025-04-13 23:29:25] server_args=ServerArgs(model_path='deepseek-ai/DeepSeek-R1-Distill-Qwen-7B', tokenizer_path='deepseek-ai/DeepSeek-R1-Distill-Qwen-7B', tokenizer_mode='auto', skip_tokenizer_init=False, load_format='auto', trust_remote_code=False, dtype='auto', kv_cache_dtype='auto', quantization=None, quantization_param_path=None, context_length=None, device='cuda', served_model_name='deepseek-ai/DeepSeek-R1-Distill-Qwen-7B', chat_template=None, completion_template=None, is_embedding=False, revision=None, host='0.0.0.0', port=31000, mem_fraction_static=0.88, max_running_requests=200, max_total_tokens=20480, chunked_prefill_size=8192, max_prefill_tokens=16384, schedule_policy='fcfs', schedule_conservativeness=1.0, cpu_offload_gb=0, page_size=1, tp_size=1, stream_interval=1, stream_output=False, random_seed=434282875, constrained_json_whitespace_pattern=None, watchdog_timeout=300, dist_timeout=None, download_dir=None, base_gpu_id=0, gpu_id_step=1, log_level='info', log_level_http=None, log_requests=False, log_requests_level=0, show_time_cost=False, enable_metrics=False, decode_log_interval=40, api_key=None, file_storage_path='sglang_storage', enable_cache_report=False, reasoning_parser='deepseek-r1', dp_size=1, load_balance_method='round_robin', ep_size=1, dist_init_addr=None, nnodes=1, node_rank=0, json_model_override_args='{}', lora_paths=None, max_loras_per_batch=8, lora_backend='triton', attention_backend=None, sampling_backend='flashinfer', grammar_backend='xgrammar', speculative_algorithm=None, speculative_draft_model_path=None, speculative_num_steps=None, speculative_eagle_topk=None, speculative_num_draft_tokens=None, speculative_accept_threshold_single=1.0, speculative_accept_threshold_acc=1.0, speculative_token_map=None, enable_double_sparsity=False, ds_channel_config_path=None, ds_heavy_channel_num=32, ds_heavy_token_num=256, ds_heavy_channel_type='qk', ds_sparse_decode_threshold=4096, disable_radix_cache=False, disable_cuda_graph=True, 
disable_cuda_graph_padding=False, enable_nccl_nvls=False, disable_outlines_disk_cache=False, disable_custom_all_reduce=False, disable_mla=False, enable_llama4_multimodal=None, disable_overlap_schedule=False, enable_mixed_chunk=False, enable_dp_attention=False, enable_ep_moe=False, enable_deepep_moe=False, deepep_mode='auto', enable_torch_compile=False, torch_compile_max_bs=32, cuda_graph_max_bs=160, cuda_graph_bs=None, torchao_config='', enable_nan_detection=False, enable_p2p_check=False, triton_attention_reduce_in_fp32=False, triton_attention_num_kv_splits=8, num_continuous_decode_steps=1, delete_ckpt_after_loading=False, enable_memory_saver=False, allow_auto_truncate=False, enable_custom_logit_processor=False, tool_call_parser=None, enable_hierarchical_cache=False, hicache_ratio=2.0, enable_flashinfer_mla=False, enable_flashmla=False, flashinfer_mla_disable_ragged=False, warmups=None, n_share_experts_fusion=0, disable_shared_experts_fusion=False, debug_tensor_dump_output_folder=None, debug_tensor_dump_input_file=None, debug_tensor_dump_inject=False, disaggregation_mode='null', disaggregation_bootstrap_port=8998, disaggregation_transfer_backend='mooncake', disable_fast_image_processor=False)
[2025-04-13 23:29:37 TP0] Attention backend not set. Use flashinfer backend by default.
[2025-04-13 23:29:37 TP0] Init torch distributed begin.
[2025-04-13 23:29:38 TP0] Init torch distributed ends. mem usage=0.00 GB
[2025-04-13 23:29:38 TP0] Load weight begin. avail mem=78.58 GB
[2025-04-13 23:29:39 TP0] Ignore import error when loading sglang.srt.models.llama4.
[2025-04-13 23:29:40 TP0] Using model weights format ['*.safetensors']
Loading safetensors checkpoint shards:   0% Completed | 0/2 [00:00<?, ?it/s]
Loading safetensors checkpoint shards:  50% Completed | 1/2 [00:01<00:01,  1.23s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00,  1.22s/it]

[2025-04-13 23:29:42 TP0] Load weight end. type=Qwen2ForCausalLM, dtype=torch.bfloat16, avail mem=48.57 GB, mem usage=30.01 GB.
[2025-04-13 23:29:42 TP0] KV Cache is allocated. #tokens: 20480, K size: 0.55 GB, V size: 0.55 GB
[2025-04-13 23:29:42 TP0] Memory pool end. avail mem=47.20 GB
[2025-04-13 23:29:42 TP0]

CUDA Graph is DISABLED.
This will cause significant performance degradation.
CUDA Graph should almost never be disabled in most usage scenarios.
If you encounter OOM issues, please try setting --mem-fraction-static to a lower value (such as 0.8 or 0.7) instead of disabling CUDA Graph.

[2025-04-13 23:29:43 TP0] max_total_num_tokens=20480, chunked_prefill_size=8192, max_prefill_tokens=16384, max_running_requests=200, context_len=131072
[2025-04-13 23:29:43] INFO:     Started server process [2802019]
[2025-04-13 23:29:43] INFO:     Waiting for application startup.
[2025-04-13 23:29:43] INFO:     Application startup complete.
[2025-04-13 23:29:43] INFO:     Uvicorn running on http://0.0.0.0:31000 (Press CTRL+C to quit)
[2025-04-13 23:29:44] INFO:     127.0.0.1:57774 - "GET /v1/models HTTP/1.1" 200 OK
[2025-04-13 23:29:44] INFO:     127.0.0.1:57790 - "GET /get_model_info HTTP/1.1" 200 OK
[2025-04-13 23:29:44 TP0] Prefill batch. #new-seq: 1, #new-token: 7, #cached-token: 0, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:29:48] INFO:     127.0.0.1:57796 - "POST /generate HTTP/1.1" 200 OK
[2025-04-13 23:29:48] The server is fired up and ready to roll!


NOTE: Typically, the server runs in a separate terminal.
In this notebook, we run the server and notebook code together, so their outputs are combined.
To improve clarity, the server logs are displayed in the original black color, while the notebook outputs are highlighted in blue.
We are running these notebooks in a parallel CI environment, so the throughput is not representative of actual performance.

JSON#

You can define a JSON schema directly or use Pydantic to define and validate the response.

Using Pydantic

[2]:
from pydantic import BaseModel, Field


# Define the schema using Pydantic
class CapitalInfo(BaseModel):
    name: str = Field(..., pattern=r"^\w+$", description="Name of the capital city")
    population: int = Field(..., description="Population of the capital city")


response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    messages=[
        {
            "role": "user",
            "content": "Please generate the information of the capital of France in the JSON format.",
        },
    ],
    temperature=0,
    max_tokens=2048,
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "foo",
            # convert the pydantic model to json schema
            "schema": CapitalInfo.model_json_schema(),
        },
    },
)

print_highlight(
    f"reasoning_content: {response.choices[0].message.reasoning_content}\n\ncontent: {response.choices[0].message.content}"
)
[2025-04-13 23:29:49 TP0] Prefill batch. #new-seq: 1, #new-token: 18, #cached-token: 1, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:29:51 TP0] Decode batch. #running-req: 1, #token: 52, token usage: 0.00, gen throughput (token/s): 5.11, #queue-req: 0,
[2025-04-13 23:30:11 TP0] Decode batch. #running-req: 1, #token: 2052, token usage: 0.10, gen throughput (token/s): 94.03, #queue-req: 0,
[2025-04-13 23:30:11] INFO:     127.0.0.1:45322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
reasoning_content: Okay, so I need to generate information about the capital of France, Paris, in JSON format. Hmm, where do I start? Well, I know that Paris is the main city in France, but I'm not exactly sure about all the details. Let me think about what information is typically included about a city's capital.

First, the name of the city is obviously Paris. Then, the country it's the capital of, which is France. I remember that Paris is the largest city in France, but I'm not sure about its population. I think it's a big city, maybe around 2 million people? I'm not certain, though. I should probably look that up, but since I'm just brainstorming, I'll go with that for now.

Next, the location. Paris is located in the northern part of France, specifically in the Île-de-France region. I think it's on the Seine River, which is a major river there. The Eiffel Tower is in Paris, so that's a key landmark. I should include that.

Language-wise, Paris is a center for French culture, so French is definitely the predominant language. I'm not sure about minority languages or dialects, but maybe there are some, but I don't know which ones. I'll just note that French is the main language.

Cuisine is another aspect. Paris is famous for its food, like baguettes, croissants, and wine. I should mention some of the popular dishes, maybe something like "Baguette,Croissant, and wine."

Transportation-wise, Paris has a well-developed public transit system, like the Métro, which is a subway network. I think there are trams and buses too. I should include that the public transport is extensive.

Economically, Paris is a significant hub. I believe it's home to many multinational companies and financial institutions. Maybe something like "home to many multinational companies and financial institutions."

Culturally, Paris is rich in museums, art, and historical sites. The Louvre and the Musée d'Histéritage come to mind. Also, it's a UNESCO World Heritage Site, so I should mention that.

I should also include some notable landmarks. The Eiffel Tower, the Arc de Triomphe, the Louvre, Notre-Dame, and the Sacré-Cœur Basilica. Maybe the gardens around the Eiffel Tower, like Champ de Mars.

Now, putting this all together into JSON format. I need to structure it with the key "capital" and then include all these details as a value. I'll organize them under different categories for clarity, like "name," "country," "population," "location," "language," "cuisine," "transportation," "economy," "culture," and "landmarks."

Wait, I'm not sure about the population figure. I think it's around 2 million, but I'm not 100% certain. Maybe I should double-check that, but for now, I'll proceed with that number.

Also, I'm not sure if Paris is the only city in France that's a capital. I think there are others, but Paris is the main one. So, I should specify that Paris is the main capital.

I should make sure the JSON syntax is correct, with proper commas and quotation marks. Each key should be in quotes, and the entire structure should be an object within the "capital" key.

Putting it all together, I'll structure the JSON with each category as a key, and the corresponding details as values. I'll make sure to include all the points I thought of, but I'll keep it concise and relevant.

I think that's a good start. Now, I'll format it properly in JSON, ensuring that each key is correctly spelled and the values are accurate based on my understanding.


content: {

"name": "Paris",
"population": 20000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
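The content shown above ends without a closing brace, which suggests that generation stopped at the max_tokens limit before the JSON object was complete (this is an inference from the output, not something the logs state explicitly). A minimal sketch of a defensive check follows, assuming the server reports an OpenAI-style finish reason of "length" in that case; `parse_structured_content` is a hypothetical helper, not part of SGLang or the OpenAI client.

```python
import json


def parse_structured_content(content, finish_reason):
    """Return the decoded JSON value, or None if the output is unusable.

    Hypothetical helper for illustration.
    """
    # "length" is the OpenAI-style finish reason for hitting max_tokens.
    if finish_reason == "length":
        return None
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Incomplete or malformed JSON, for example a missing closing brace.
        return None


# Made-up inputs for illustration. With a live server you would pass
# response.choices[0].message.content and response.choices[0].finish_reason.
print(parse_structured_content('{"name": "Paris", "population": 2000000}', "stop"))
print(parse_structured_content('{"name": "Paris", "population": 2000', "length"))
```

Because the reasoning section counts against the same token budget, a long chain of thought can leave too few tokens for the structured answer, so raising max_tokens is one option to consider when the helper returns None.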

JSON Schema Directly

[3]:
import json

json_schema = json.dumps(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string", "pattern": "^[\\w]+$"},
            "population": {"type": "integer"},
        },
        "required": ["name", "population"],
    }
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    messages=[
        {
            "role": "user",
            "content": "Give me the information of the capital of France in the JSON format.",
        },
    ],
    temperature=0,
    max_tokens=2048,
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "foo", "schema": json.loads(json_schema)},
    },
)

print_highlight(
    f"reasoning_content: {response.choices[0].message.reasoning_content}\n\ncontent: {response.choices[0].message.content}"
)
[2025-04-13 23:30:11 TP0] Prefill batch. #new-seq: 1, #new-token: 17, #cached-token: 2, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:30:11 TP0] Decode batch. #running-req: 1, #token: 44, token usage: 0.00, gen throughput (token/s): 71.69, #queue-req: 0,
[2025-04-13 23:30:36 TP0] Decode batch. #running-req: 1, #token: 2044, token usage: 0.10, gen throughput (token/s): 62.42, #queue-req: 0,
[2025-04-13 23:30:36] INFO:     127.0.0.1:45322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
reasoning_content: Okay, so I need to figure out the information about the capital of France and present it in JSON format. Hmm, let's start by recalling what I know about France. I know that Paris is the capital, but I should double-check that to be sure. Yeah, Paris is definitely the administrative and cultural center of France.

Now, I need to gather more details about Paris. Let me think about its location. Paris is located in the northern part of France, right? It's in the Île-de-France region. I remember that it's an island, so that's an important geographical feature. The coordinates are something like 48°51′N 2°28′E. I think that's correct, but I should make sure. Maybe I can visualize it on a map—Paris is in the north, near the Eiffel Tower and the Seine River.

Next, I should consider the history of Paris. It's been a major city for centuries. I know that during the French Revolution, Paris was the center of the revolution, and it's still a significant cultural and political hub today. It's also a global city, attracting millions of visitors each year.

Economically, Paris is a major financial and business center. I believe it's home to many multinational corporations and financial institutions. The Paris Stock Exchange comes to mind, and there are a lot of big companies based there. The city is also known for its fashion industry, with some of the world's most famous brands having headquarters in Paris.

Culturally, Paris is rich with art, literature, and history. The Louvre Museum is one of the world's largest art museums, and it's located in Paris. There's also the Musée d'Orsay, which is another major cultural institution. The city has a vibrant nightlife, with famous bars and clubs. I think places like Le Marais and Montmartre are known for their lively nightlife and artistic scenes.

Demographically, Paris is a large city, but it's also one of the most densely populated in the world. The metropolitan area covers a vast area, extending beyond the city limits into the outer suburbs. The population is diverse, with people from many different countries and backgrounds living there.

Transportation-wise, Paris has an extensive public transit system, including the Métro, which is a highly efficient subway network. There are also buses, trams, and trolleys that serve the city. The Seine River has several bridges and ferries that connect different parts of the city.

I should also mention some notable landmarks. The Eiffel Tower is a must-see, and the Arc de Triomphe is another iconic structure. The Notre-Dame Cathedral is a major religious site, though it's had some recent renovations and changes due to the fire that occurred in 2019.

In terms of language, Paris is a city where French is the official language, but it's also a multicultural city. People from around the world speak English, Spanish, German, and other languages there, contributing to a very international atmosphere.

I think I've covered the main points. Now, I need to structure this information into a JSON format. I'll start with the basic information: name, location, and status. Then, I'll add historical, economic, cultural, demographic, transportation, and notable landmarks sections. Each section will have relevant details under subkeys.

Wait, I should make sure that the JSON syntax is correct. I'll use proper braces, commas, and quotation marks. Also, I'll ensure that the keys are in lowercase letters as per JSON standards. Let me organize the information step by step.

First, the capital object will have properties like name, location, and status. Then, each additional section will be an object with its own properties. I'll make sure to include all the key details I thought of earlier, making sure each is accurate and concise.

I think that's a solid plan. Now, I'll put it all together into the JSON structure, double-checking for any missing or incorrect information. Once done, I'll review it to ensure it's well-organized and accurately represents the information about Paris as the capital of France.


content: {

"name": "Paris",
"population": 2153000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

EBNF#

[4]:
ebnf_grammar = """
root ::= city | description
city ::= "London" | "Paris" | "Berlin" | "Rome"
description ::= city " is " status
status ::= "the capital of " country
country ::= "England" | "France" | "Germany" | "Italy"
"""

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    messages=[
        {"role": "system", "content": "You are a helpful geography bot."},
        {
            "role": "user",
            "content": "Give me the information of the capital of France.",
        },
    ],
    temperature=0,
    max_tokens=2048,
    extra_body={"ebnf": ebnf_grammar},
)

print_highlight(
    f"reasoning_content: {response.choices[0].message.reasoning_content}\n\ncontent: {response.choices[0].message.content}"
)
[2025-04-13 23:30:36 TP0] Prefill batch. #new-seq: 1, #new-token: 21, #cached-token: 1, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:30:36 TP0] Decode batch. #running-req: 1, #token: 39, token usage: 0.00, gen throughput (token/s): 60.82, #queue-req: 0,
[2025-04-13 23:30:37 TP0] Decode batch. #running-req: 1, #token: 79, token usage: 0.00, gen throughput (token/s): 103.13, #queue-req: 0,
[2025-04-13 23:30:37 TP0] Decode batch. #running-req: 1, #token: 119, token usage: 0.01, gen throughput (token/s): 103.77, #queue-req: 0,
[2025-04-13 23:30:38 TP0] Decode batch. #running-req: 1, #token: 159, token usage: 0.01, gen throughput (token/s): 103.83, #queue-req: 0,
[2025-04-13 23:30:38] INFO:     127.0.0.1:45322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
reasoning_content: Okay, so I need to figure out the capital of France. Hmm, I remember that France is a country in Europe, right? I think Paris is the capital because I've heard it mentioned a lot. But wait, I'm not 100% sure. Let me think about other capitals I know. Germany's capital is Berlin, Italy's is Rome, Spain's is Madrid, and the UK's is London. Yeah, Paris fits in there as the capital of France. I don't think there's any other city that's as prominent as Paris when it comes to being the country's main administrative center. Plus, I've heard people talk about the Eiffel Tower and the Louvre, which are both in Paris. So, I'm pretty confident that Paris is the correct answer.


content: Paris is the capital of France
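
The grammar above constrains the form of the answer, not its factual content. As a minimal sketch (a hand-translated copy of the EBNF rules in plain Python, with the hypothetical helper names `accepted_strings` and `matches_grammar`; it does not call SGLang or any grammar backend), the full set of strings the `root` rule can produce can be enumerated like this:

```python
# Terminals copied from the `city` and `country` rules of the EBNF grammar above.
CITIES = ["London", "Paris", "Berlin", "Rome"]
COUNTRIES = ["England", "France", "Germany", "Italy"]


def accepted_strings():
    """Enumerate every string derivable from `root ::= city | description`."""
    # description ::= city " is " status, status ::= "the capital of " country
    descriptions = {
        f"{city} is the capital of {country}"
        for city in CITIES
        for country in COUNTRIES
    }
    return set(CITIES) | descriptions


def matches_grammar(text):
    """Return True if `text` is exactly one of the strings the grammar accepts."""
    return text in accepted_strings()


print(len(accepted_strings()))
print(matches_grammar("Paris is the capital of France"))
print(matches_grammar("Rome is the capital of France"))
```

The grammar pairs any city with any country, so a sentence such as "Rome is the capital of France" is also well-formed. Picking the correct pair is left to the model, which is one reason the unconstrained reasoning section matters.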

Regular expression#

[5]:
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0,
    max_tokens=2048,
    extra_body={"regex": "(Paris|London)"},
)

print_highlight(
    f"reasoning_content: {response.choices[0].message.reasoning_content}\n\ncontent: {response.choices[0].message.content}"
)
[2025-04-13 23:30:38 TP0] Prefill batch. #new-seq: 1, #new-token: 10, #cached-token: 2, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:30:38 TP0] Decode batch. #running-req: 1, #token: 17, token usage: 0.00, gen throughput (token/s): 89.53, #queue-req: 0,
[2025-04-13 23:30:38 TP0] Decode batch. #running-req: 1, #token: 57, token usage: 0.00, gen throughput (token/s): 98.43, #queue-req: 0,
[2025-04-13 23:30:39 TP0] Decode batch. #running-req: 1, #token: 97, token usage: 0.00, gen throughput (token/s): 103.23, #queue-req: 0,
[2025-04-13 23:30:39 TP0] Decode batch. #running-req: 1, #token: 137, token usage: 0.01, gen throughput (token/s): 99.28, #queue-req: 0,
[2025-04-13 23:30:39] INFO:     127.0.0.1:45322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
reasoning_content: Okay, so I need to figure out the capital of France. Hmm, I remember that France is a country in Europe, right? I think Paris is the capital because I've heard it mentioned a lot, especially in movies and TV shows. But wait, I'm not entirely sure. Maybe I should think about other capitals of countries I know. For example, Germany's capital is Berlin, Italy's is Rome, Spain's is Madrid. So, following that pattern, France's capital should be Paris. I don't think it's another city like Lille or Nice because those are more known for their cities or natural beauty rather than being the official capital. Yeah, I'm pretty confident that Paris is the correct answer.


content: Paris
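
A client can re-check the returned `content` against the same pattern before using it. The sketch below uses only Python's standard `re` module. The helper name `satisfies_regex` is hypothetical, and treating the constraint as a full match of the whole answer is an assumption that is consistent with the output above but is not taken from the server's documentation.

```python
import re

# Same pattern that was passed in extra_body={"regex": ...} above.
PATTERN = re.compile(r"(Paris|London)")


def satisfies_regex(content):
    """Return True if the whole answer (not just a substring) matches the pattern."""
    return PATTERN.fullmatch(content) is not None


print(satisfies_regex("Paris"))
print(satisfies_regex("Paris, France"))
```

Using `fullmatch` rather than `search` rejects answers that merely contain one of the allowed city names.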

Structural Tag#

[6]:
tool_get_current_weather = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city to find the weather for, e.g. 'San Francisco'",
                },
                "state": {
                    "type": "string",
                    "description": "the two-letter abbreviation for the state that the city is"
                    " in, e.g. 'CA' which would mean 'California'",
                },
                "unit": {
                    "type": "string",
                    "description": "The unit to fetch the temperature in",
                    "enum": ["celsius", "fahrenheit"],
                },
            },
            "required": ["city", "state", "unit"],
        },
    },
}

tool_get_current_date = {
    "type": "function",
    "function": {
        "name": "get_current_date",
        "description": "Get the current date and time for a given timezone",
        "parameters": {
            "type": "object",
            "properties": {
                "timezone": {
                    "type": "string",
                    "description": "The timezone to fetch the current date and time for, e.g. 'America/New_York'",
                }
            },
            "required": ["timezone"],
        },
    },
}

schema_get_current_weather = tool_get_current_weather["function"]["parameters"]
schema_get_current_date = tool_get_current_date["function"]["parameters"]


def get_messages():
    return [
        {
            "role": "system",
            "content": f"""
# Tool Instructions
- Always execute python code in messages that you share.
- When looking for real time information use relevant functions if available else fallback to brave_search
You have access to the following functions:
Use the function 'get_current_weather' to: Get the current weather in a given location
{tool_get_current_weather["function"]}
Use the function 'get_current_date' to: Get the current date and time for a given timezone
{tool_get_current_date["function"]}
If you choose to call a function, ONLY reply in the following format:
<{{start_tag}}={{function_name}}>{{parameters}}{{end_tag}}
where
start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`
Here is an example,
<function=example_function_name>{{"example_name": "example_value"}}</function>
Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- Always add your sources when using search results to answer the user query
You are a helpful assistant.""",
        },
        {
            "role": "user",
            "content": "You are in New York. Please get the current date and time, and the weather.",
        },
    ]


messages = get_messages()

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    messages=messages,
    response_format={
        "type": "structural_tag",
        "max_new_tokens": 2048,
        "structures": [
            {
                "begin": "<function=get_current_weather>",
                "schema": schema_get_current_weather,
                "end": "</function>",
            },
            {
                "begin": "<function=get_current_date>",
                "schema": schema_get_current_date,
                "end": "</function>",
            },
        ],
        "triggers": ["<function="],
    },
)

print_highlight(
    f"reasoning_content: {response.choices[0].message.reasoning_content}\n\ncontent: {response.choices[0].message.content}"
)
[2025-04-13 23:30:40 TP0] Prefill batch. #new-seq: 1, #new-token: 471, #cached-token: 1, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:30:40 TP0] Decode batch. #running-req: 1, #token: 488, token usage: 0.02, gen throughput (token/s): 51.49, #queue-req: 0,
[2025-04-13 23:30:40 TP0] Decode batch. #running-req: 1, #token: 528, token usage: 0.03, gen throughput (token/s): 90.32, #queue-req: 0,
[2025-04-13 23:30:41 TP0] Decode batch. #running-req: 1, #token: 568, token usage: 0.03, gen throughput (token/s): 98.98, #queue-req: 0,
[2025-04-13 23:30:41 TP0] Decode batch. #running-req: 1, #token: 608, token usage: 0.03, gen throughput (token/s): 98.58, #queue-req: 0,
[2025-04-13 23:30:42 TP0] Decode batch. #running-req: 1, #token: 648, token usage: 0.03, gen throughput (token/s): 102.57, #queue-req: 0,
[2025-04-13 23:30:42 TP0] Decode batch. #running-req: 1, #token: 688, token usage: 0.03, gen throughput (token/s): 101.49, #queue-req: 0,
[2025-04-13 23:30:42 TP0] Decode batch. #running-req: 1, #token: 728, token usage: 0.04, gen throughput (token/s): 99.40, #queue-req: 0,
[2025-04-13 23:30:43] INFO:     127.0.0.1:45322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
reasoning_content: Alright, I need to figure out how to respond to the user's request. They mentioned being in New York and want the current date and time along with the weather.

First, I should determine which functions to use. The user mentioned two functions: 'get_current_weather' and 'get_current_date'.

For the date and time, I'll call 'get_current_date' with the timezone parameter set to 'America/New_York'. That should provide the correct datetime.

Next, for the weather, I'll use 'get_current_weather'. The city is New York, the state is NY, and the unit is likely Fahrenheit since the user didn't specify otherwise.

I need to make sure each function call is properly formatted with the required parameters. Also, I should include the source in the response, probably citing the tools as per the instructions.

Putting it all together, I'll format each function call correctly, include the parameters, and add the sources at the end.


content: {"timezone": "America/New_York"}
{"city": "New York", "state": "NY", "unit": "fahrenheit"}

Sources:
- get_current_date: https://real-time-dataools.com/
- get_current_weather: https://weather APIs documentation
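
Once the response arrives, the function calls have to be pulled out of `content` before they can be dispatched. The following is a minimal sketch, assuming the raw `content` string still carries the `<function=NAME>{...}</function>` tags described in the system prompt. `parse_function_calls` is a hypothetical helper, not part of SGLang or the OpenAI client.

```python
import json
import re

# Matches <function=NAME>{...}</function>, the call format from the system prompt.
FUNCTION_CALL = re.compile(r"<function=(\w+)>(.*?)</function>", re.DOTALL)


def parse_function_calls(content):
    """Return a list of (function_name, arguments_dict) found in `content`."""
    calls = []
    for name, raw_args in FUNCTION_CALL.findall(content):
        # The structural tag constrains the body to the tool's JSON schema,
        # so it is expected to parse as a JSON object.
        calls.append((name, json.loads(raw_args)))
    return calls


example = (
    '<function=get_current_date>{"timezone": "America/New_York"}</function>\n'
    '<function=get_current_weather>{"city": "New York", "state": "NY", '
    '"unit": "fahrenheit"}</function>'
)
for name, args in parse_function_calls(example):
    print(name, args)
```

Text outside the tags, such as a trailing "Sources:" list, is ignored by the pattern, so free-form text around the calls does not interfere with extraction.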

Native API and SGLang Runtime (SRT)#

JSON#

Using Pydantic

[7]:
import json
import requests
from pydantic import BaseModel, Field
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")


# Define the schema using Pydantic
class CapitalInfo(BaseModel):
    name: str = Field(..., pattern=r"^\w+$", description="Name of the capital city")
    population: int = Field(..., description="Population of the capital city")


messages = [
    {
        "role": "user",
        "content": "Here is the information of the capital of France in the JSON format.\n",
    }
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# Make API request
response = requests.post(
    f"http://localhost:{port}/generate",
    json={
        "text": text,
        "sampling_params": {
            "temperature": 0,
            "max_new_tokens": 2048,
            "json_schema": json.dumps(CapitalInfo.model_json_schema()),
        },
    },
)
print(response.json())


reasoning_content = response.json()["text"].split("</think>")[0]
content = json.loads(response.json()["text"].split("</think>")[1])
print_highlight(f"reasoning_content: {reasoning_content}\n\ncontent: {content}")
[2025-04-13 23:30:43 TP0] Prefill batch. #new-seq: 1, #new-token: 19, #cached-token: 1, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:30:43 TP0] Decode batch. #running-req: 1, #token: 46, token usage: 0.00, gen throughput (token/s): 44.83, #queue-req: 0,
[2025-04-13 23:30:44 TP0] Decode batch. #running-req: 1, #token: 86, token usage: 0.00, gen throughput (token/s): 105.35, #queue-req: 0,
[2025-04-13 23:30:44 TP0] Decode batch. #running-req: 1, #token: 126, token usage: 0.01, gen throughput (token/s): 100.76, #queue-req: 0,
[2025-04-13 23:30:44 TP0] Decode batch. #running-req: 1, #token: 166, token usage: 0.01, gen throughput (token/s): 104.84, #queue-req: 0,
[2025-04-13 23:30:45 TP0] Decode batch. #running-req: 1, #token: 206, token usage: 0.01, gen throughput (token/s): 100.12, #queue-req: 0,
[2025-04-13 23:30:45 TP0] Decode batch. #running-req: 1, #token: 246, token usage: 0.01, gen throughput (token/s): 105.50, #queue-req: 0,
[2025-04-13 23:30:46 TP0] Decode batch. #running-req: 1, #token: 286, token usage: 0.01, gen throughput (token/s): 103.17, #queue-req: 0,
[2025-04-13 23:30:46 TP0] Decode batch. #running-req: 1, #token: 326, token usage: 0.02, gen throughput (token/s): 100.65, #queue-req: 0,
[2025-04-13 23:30:46 TP0] Decode batch. #running-req: 1, #token: 366, token usage: 0.02, gen throughput (token/s): 104.96, #queue-req: 0,
[2025-04-13 23:30:47 TP0] Decode batch. #running-req: 1, #token: 406, token usage: 0.02, gen throughput (token/s): 100.45, #queue-req: 0,
[2025-04-13 23:30:47 TP0] Decode batch. #running-req: 1, #token: 446, token usage: 0.02, gen throughput (token/s): 104.75, #queue-req: 0,
[2025-04-13 23:30:48 TP0] Decode batch. #running-req: 1, #token: 486, token usage: 0.02, gen throughput (token/s): 102.43, #queue-req: 0,
[2025-04-13 23:30:48 TP0] Decode batch. #running-req: 1, #token: 526, token usage: 0.03, gen throughput (token/s): 98.28, #queue-req: 0,
[2025-04-13 23:30:48 TP0] Decode batch. #running-req: 1, #token: 566, token usage: 0.03, gen throughput (token/s): 102.12, #queue-req: 0,
[2025-04-13 23:30:49 TP0] Decode batch. #running-req: 1, #token: 606, token usage: 0.03, gen throughput (token/s): 102.93, #queue-req: 0,
[2025-04-13 23:30:49 TP0] Decode batch. #running-req: 1, #token: 646, token usage: 0.03, gen throughput (token/s): 101.92, #queue-req: 0,
[2025-04-13 23:30:50 TP0] Decode batch. #running-req: 1, #token: 686, token usage: 0.03, gen throughput (token/s): 101.84, #queue-req: 0,
[2025-04-13 23:30:50 TP0] Decode batch. #running-req: 1, #token: 726, token usage: 0.04, gen throughput (token/s): 88.24, #queue-req: 0,
[2025-04-13 23:30:50] INFO:     127.0.0.1:46982 - "POST /generate HTTP/1.1" 200 OK
{'text': 'Okay, so I need to provide the information about the capital of France in JSON format. Hmm, I\'m not exactly sure where the capital of France is, but I think it\'s Paris. Yeah, I remember hearing that Paris is the capital. Let me think about what details I should include. \n\nFirst, the basic info: country, city, population, and maybe some key landmarks. I know the population is around 2 million, but I\'m not sure of the exact number. I think it\'s approximately 2,165,000. As for landmarks, the Eiffel Tower is a must. The Louvre Museum is another famous spot. The Arc de Triomphe is also iconic. Maybe the Seine River is important too since it\'s a major river in the city.\n\nI should structure this in JSON. So, the main object would have a "country" key pointing to "France," and the "capital" key pointing to "Paris." Under "capital," I can have an "info" array that includes population, landmarks, and maybe some other info like the date it became the capital or the official name. Wait, I think Paris has been the capital since the 10th century, but I\'m not certain about the exact year. I\'ll just put around 978 AD for now.\n\nLet me make sure I\'m not missing anything important. Maybe the official name of the city is "City of Light," which is the official name of Paris. Including that could be helpful. Also, perhaps the area of the city, but I don\'t remember the exact figure. Maybe around 105 square kilometers? I\'m not sure, but I\'ll include it as an estimate.\n\nPutting it all together, the JSON structure would have a top-level object with "country" and "capital." The "capital" would have an "info" array with population, landmarks, area, and official name. I think that covers the essentials. I should double-check some of these numbers to make sure they\'re accurate, but since I\'m just recalling, I\'ll proceed with what I have.\n\nWait, I should also consider if there are any other notable facts. Maybe the number of hospitals or universities? 
I\'m not sure, but perhaps that\'s too detailed. Keeping it simple with population, landmarks, area, and official name seems sufficient.\n\nSo, to summarize, the JSON would look like this:\n\n{\n  "country": "France",\n  "capital": {\n    "info": [\n      {\n        "key": "population",\n        "value": "2,165,000"\n      },\n      {\n        "key": "landmarks",\n        "value": [\n          "Eiffel Tower",\n          "Louvre Museum",\n          "Arc de Triomphe",\n          "Seine River"\n        ]\n      },\n      {\n        "key": "area",\n        "value": "105 km²"\n      },\n      {\n        "key": "official_name",\n        "value": "City of Light"\n      },\n      {\n        "key": "capital_year",\n        "value": "978 AD"\n      }\n    ]\n  }\n}\n\nI think that\'s a comprehensive yet concise structure. I hope I didn\'t miss any important details, but this should cover the basics about the capital of France.\n</think>{\n\n"name": "Paris",\n"population": 2165000\n}', 'meta_info': {'id': '38bd7f5c0fa44617bc99337a9b57f390', 'finish_reason': {'type': 'stop', 'matched': 151643}, 'prompt_tokens': 20, 'completion_tokens': 711, 'cached_tokens': 1, 'e2e_latency': 7.015674352645874}}
reasoning_content: Okay, so I need to provide the information about the capital of France in JSON format. Hmm, I'm not exactly sure where the capital of France is, but I think it's Paris. Yeah, I remember hearing that Paris is the capital. Let me think about what details I should include.

First, the basic info: country, city, population, and maybe some key landmarks. I know the population is around 2 million, but I'm not sure of the exact number. I think it's approximately 2,165,000. As for landmarks, the Eiffel Tower is a must. The Louvre Museum is another famous spot. The Arc de Triomphe is also iconic. Maybe the Seine River is important too since it's a major river in the city.

I should structure this in JSON. So, the main object would have a "country" key pointing to "France," and the "capital" key pointing to "Paris." Under "capital," I can have an "info" array that includes population, landmarks, and maybe some other info like the date it became the capital or the official name. Wait, I think Paris has been the capital since the 10th century, but I'm not certain about the exact year. I'll just put around 978 AD for now.

Let me make sure I'm not missing anything important. Maybe the official name of the city is "City of Light," which is the official name of Paris. Including that could be helpful. Also, perhaps the area of the city, but I don't remember the exact figure. Maybe around 105 square kilometers? I'm not sure, but I'll include it as an estimate.

Putting it all together, the JSON structure would have a top-level object with "country" and "capital." The "capital" would have an "info" array with population, landmarks, area, and official name. I think that covers the essentials. I should double-check some of these numbers to make sure they're accurate, but since I'm just recalling, I'll proceed with what I have.

Wait, I should also consider if there are any other notable facts. Maybe the number of hospitals or universities? I'm not sure, but perhaps that's too detailed. Keeping it simple with population, landmarks, area, and official name seems sufficient.

So, to summarize, the JSON would look like this:

{
"country": "France",
"capital": {
"info": [
{
"key": "population",
"value": "2,165,000"
},
{
"key": "landmarks",
"value": [
"Eiffel Tower",
"Louvre Museum",
"Arc de Triomphe",
"Seine River"
]
},
{
"key": "area",
"value": "105 km²"
},
{
"key": "official_name",
"value": "City of Light"
},
{
"key": "capital_year",
"value": "978 AD"
}
]
}
}

I think that's a comprehensive yet concise structure. I hope I didn't miss any important details, but this should cover the basics about the capital of France.


content: {'name': 'Paris', 'population': 2165000}
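
With the native `/generate` endpoint the reasoning and the structured answer come back in a single `text` field, so the client has to split them itself. Indexing the result of `split("</think>")` raises an `IndexError` when the end-of-thinking marker is missing. The sketch below is a slightly more defensive variant using only the standard library; `split_reasoning` is a hypothetical helper name, and returning `None` for a missing answer is a design choice made here for illustration.

```python
import json


def split_reasoning(text, think_end="</think>"):
    """Split raw generated text into (reasoning, parsed_json_or_None)."""
    reasoning, separator, answer = text.partition(think_end)
    if not separator:
        # The model never closed its reasoning section, so there is no
        # structured answer to parse.
        return text.strip(), None
    return reasoning.strip(), json.loads(answer)


raw = 'I think it is Paris.\n</think>{\n\n"name": "Paris",\n"population": 2165000\n}'
reasoning, content = split_reasoning(raw)
print(reasoning)
print(content)
```

`json.loads` still raises `json.JSONDecodeError` if the answer was cut off, for example by `max_new_tokens`, so callers may want to catch that as well.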

JSON Schema Directly

[8]:
json_schema = json.dumps(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string", "pattern": "^[\\w]+$"},
            "population": {"type": "integer"},
        },
        "required": ["name", "population"],
    }
)

# Build the prompt from the chat messages (not from the already-templated `text`)
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
response = requests.post(
    f"http://localhost:{port}/generate",
    json={
        "text": text,
        "sampling_params": {
            "temperature": 0,
            "max_new_tokens": 2048,
            "json_schema": json_schema,
        },
    },
)

print_highlight(response.json())
[2025-04-13 23:30:50 TP0] Prefill batch. #new-seq: 1, #new-token: 3, #cached-token: 2, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:30:50 TP0] Decode batch. #running-req: 1, #token: 40, token usage: 0.00, gen throughput (token/s): 96.58, #queue-req: 0,
[2025-04-13 23:30:51 TP0] Decode batch. #running-req: 1, #token: 80, token usage: 0.00, gen throughput (token/s): 102.10, #queue-req: 0,
[2025-04-13 23:30:51 TP0] Decode batch. #running-req: 1, #token: 120, token usage: 0.01, gen throughput (token/s): 103.34, #queue-req: 0,
[2025-04-13 23:30:52 TP0] Decode batch. #running-req: 1, #token: 160, token usage: 0.01, gen throughput (token/s): 99.35, #queue-req: 0,
[2025-04-13 23:30:52 TP0] Decode batch. #running-req: 1, #token: 200, token usage: 0.01, gen throughput (token/s): 98.96, #queue-req: 0,
[2025-04-13 23:30:52 TP0] Decode batch. #running-req: 1, #token: 240, token usage: 0.01, gen throughput (token/s): 104.68, #queue-req: 0,
[2025-04-13 23:30:53 TP0] Decode batch. #running-req: 1, #token: 280, token usage: 0.01, gen throughput (token/s): 95.68, #queue-req: 0,
[2025-04-13 23:30:53 TP0] Decode batch. #running-req: 1, #token: 320, token usage: 0.02, gen throughput (token/s): 96.43, #queue-req: 0,
[2025-04-13 23:30:54 TP0] Decode batch. #running-req: 1, #token: 360, token usage: 0.02, gen throughput (token/s): 86.66, #queue-req: 0,
[2025-04-13 23:30:54 TP0] Decode batch. #running-req: 1, #token: 400, token usage: 0.02, gen throughput (token/s): 101.76, #queue-req: 0,
[2025-04-13 23:30:54 TP0] Decode batch. #running-req: 1, #token: 440, token usage: 0.02, gen throughput (token/s): 101.13, #queue-req: 0,
[2025-04-13 23:30:55 TP0] Decode batch. #running-req: 1, #token: 480, token usage: 0.02, gen throughput (token/s): 104.46, #queue-req: 0,
[2025-04-13 23:30:55 TP0] Decode batch. #running-req: 1, #token: 520, token usage: 0.03, gen throughput (token/s): 103.08, #queue-req: 0,
[2025-04-13 23:30:56 TP0] Decode batch. #running-req: 1, #token: 560, token usage: 0.03, gen throughput (token/s): 100.09, #queue-req: 0,
[2025-04-13 23:30:56 TP0] Decode batch. #running-req: 1, #token: 600, token usage: 0.03, gen throughput (token/s): 100.23, #queue-req: 0,
[2025-04-13 23:30:56 TP0] Decode batch. #running-req: 1, #token: 640, token usage: 0.03, gen throughput (token/s): 103.56, #queue-req: 0,
[2025-04-13 23:30:57 TP0] Decode batch. #running-req: 1, #token: 680, token usage: 0.03, gen throughput (token/s): 99.61, #queue-req: 0,
[2025-04-13 23:30:57 TP0] Decode batch. #running-req: 1, #token: 720, token usage: 0.04, gen throughput (token/s): 100.73, #queue-req: 0,
[2025-04-13 23:30:58 TP0] Decode batch. #running-req: 1, #token: 760, token usage: 0.04, gen throughput (token/s): 103.48, #queue-req: 0,
[2025-04-13 23:30:58 TP0] Decode batch. #running-req: 1, #token: 800, token usage: 0.04, gen throughput (token/s): 105.93, #queue-req: 0,
[2025-04-13 23:30:58 TP0] Decode batch. #running-req: 1, #token: 840, token usage: 0.04, gen throughput (token/s): 102.78, #queue-req: 0,
[2025-04-13 23:30:59 TP0] Decode batch. #running-req: 1, #token: 880, token usage: 0.04, gen throughput (token/s): 99.56, #queue-req: 0,
[2025-04-13 23:30:59 TP0] Decode batch. #running-req: 1, #token: 920, token usage: 0.04, gen throughput (token/s): 105.61, #queue-req: 0,
[2025-04-13 23:31:00 TP0] Decode batch. #running-req: 1, #token: 960, token usage: 0.05, gen throughput (token/s): 101.18, #queue-req: 0,
[2025-04-13 23:31:00 TP0] Decode batch. #running-req: 1, #token: 1000, token usage: 0.05, gen throughput (token/s): 103.55, #queue-req: 0,
[2025-04-13 23:31:00 TP0] Decode batch. #running-req: 1, #token: 1040, token usage: 0.05, gen throughput (token/s): 104.72, #queue-req: 0,
[2025-04-13 23:31:01 TP0] Decode batch. #running-req: 1, #token: 1080, token usage: 0.05, gen throughput (token/s): 100.00, #queue-req: 0,
[2025-04-13 23:31:01 TP0] Decode batch. #running-req: 1, #token: 1120, token usage: 0.05, gen throughput (token/s): 100.98, #queue-req: 0,
[2025-04-13 23:31:01 TP0] Decode batch. #running-req: 1, #token: 1160, token usage: 0.06, gen throughput (token/s): 100.31, #queue-req: 0,
[2025-04-13 23:31:02 TP0] Decode batch. #running-req: 1, #token: 1200, token usage: 0.06, gen throughput (token/s): 103.86, #queue-req: 0,
[2025-04-13 23:31:02 TP0] Decode batch. #running-req: 1, #token: 1240, token usage: 0.06, gen throughput (token/s): 103.25, #queue-req: 0,
[2025-04-13 23:31:03 TP0] Decode batch. #running-req: 1, #token: 1280, token usage: 0.06, gen throughput (token/s): 102.33, #queue-req: 0,
[2025-04-13 23:31:03 TP0] Decode batch. #running-req: 1, #token: 1320, token usage: 0.06, gen throughput (token/s): 100.21, #queue-req: 0,
[2025-04-13 23:31:03 TP0] Decode batch. #running-req: 1, #token: 1360, token usage: 0.07, gen throughput (token/s): 106.13, #queue-req: 0,
[2025-04-13 23:31:04 TP0] Decode batch. #running-req: 1, #token: 1400, token usage: 0.07, gen throughput (token/s): 101.20, #queue-req: 0,
[2025-04-13 23:31:04 TP0] Decode batch. #running-req: 1, #token: 1440, token usage: 0.07, gen throughput (token/s): 97.64, #queue-req: 0,
[2025-04-13 23:31:05 TP0] Decode batch. #running-req: 1, #token: 1480, token usage: 0.07, gen throughput (token/s): 98.03, #queue-req: 0,
[2025-04-13 23:31:05 TP0] Decode batch. #running-req: 1, #token: 1520, token usage: 0.07, gen throughput (token/s): 102.17, #queue-req: 0,
[2025-04-13 23:31:05 TP0] Decode batch. #running-req: 1, #token: 1560, token usage: 0.08, gen throughput (token/s): 103.77, #queue-req: 0,
[2025-04-13 23:31:06 TP0] Decode batch. #running-req: 1, #token: 1600, token usage: 0.08, gen throughput (token/s): 99.89, #queue-req: 0,
[2025-04-13 23:31:06 TP0] Decode batch. #running-req: 1, #token: 1640, token usage: 0.08, gen throughput (token/s): 105.33, #queue-req: 0,
[2025-04-13 23:31:07 TP0] Decode batch. #running-req: 1, #token: 1680, token usage: 0.08, gen throughput (token/s): 91.14, #queue-req: 0,
[2025-04-13 23:31:07 TP0] Decode batch. #running-req: 1, #token: 1720, token usage: 0.08, gen throughput (token/s): 100.51, #queue-req: 0,
[2025-04-13 23:31:07 TP0] Decode batch. #running-req: 1, #token: 1760, token usage: 0.09, gen throughput (token/s): 103.10, #queue-req: 0,
[2025-04-13 23:31:08 TP0] Decode batch. #running-req: 1, #token: 1800, token usage: 0.09, gen throughput (token/s): 105.43, #queue-req: 0,
[2025-04-13 23:31:08 TP0] Decode batch. #running-req: 1, #token: 1840, token usage: 0.09, gen throughput (token/s): 100.90, #queue-req: 0,
[2025-04-13 23:31:09 TP0] Decode batch. #running-req: 1, #token: 1880, token usage: 0.09, gen throughput (token/s): 102.81, #queue-req: 0,
[2025-04-13 23:31:09 TP0] Decode batch. #running-req: 1, #token: 1920, token usage: 0.09, gen throughput (token/s): 101.28, #queue-req: 0,
[2025-04-13 23:31:09 TP0] Decode batch. #running-req: 1, #token: 1960, token usage: 0.10, gen throughput (token/s): 100.65, #queue-req: 0,
[2025-04-13 23:31:10 TP0] Decode batch. #running-req: 1, #token: 2000, token usage: 0.10, gen throughput (token/s): 106.44, #queue-req: 0,
[2025-04-13 23:31:10 TP0] Decode batch. #running-req: 1, #token: 2040, token usage: 0.10, gen throughput (token/s): 100.84, #queue-req: 0,
[2025-04-13 23:31:10] INFO:     127.0.0.1:41506 - "POST /generate HTTP/1.1" 200 OK
{'text': 'Okay, so I need to figure out how to solve this problem. Hmm, wait, the user just said "Anshul" and then "Please reason step by step, and put your final answer within \\boxed{}." But there\'s no specific question here. Maybe they forgot to ask a question? Or perhaps they\'re referring to a previous conversation? I\'m a bit confused. Let me think about what I can do.\n\nMaybe I can prompt them to provide more details or clarify what they need help with. I should respond politely and ask for clarification. That way, I can assist them better once I understand their request.\n\nSo, I\'ll say something like, "Hello! It seems like your question got cut off. Could you please provide more details or clarify what you need help with? I\'m here to assist you!"\n\nYeah, that should do it. It\'s friendly and encourages them to share more information so I can help them effectively.\n{\n\n"name" \n\n\n[... the run of \n continues until the 2048-token limit ...]', 'meta_info': {'id': '48c8ee703d26452587e7ee47270e152d', 'finish_reason': {'type': 'length', 'length': 2048}, 'prompt_tokens': 5, 'completion_tokens': 2048, 'cached_tokens': 2, 'e2e_latency': 20.29872751235962}}

EBNF#
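
The grammar used in this section accepts either a bare city name or a sentence of the form `<city> is the capital of <country>`. As a rough way to check whether a returned `text` satisfies it, the sketch below hand-translates the grammar into a regular expression and applies it to whatever follows the `</think>` marker. The helper names `answer_part` and `matches_grammar` are hypothetical and not part of SGLang, and stripping surrounding whitespace is a leniency the grammar itself does not grant.

```python
import re

# Hand-written regex equivalent of the EBNF grammar in this section.
# This translation is an assumption for illustration; SGLang does not produce it.
CITY = r"(?:London|Paris|Berlin|Rome)"
COUNTRY = r"(?:England|France|Germany|Italy)"
# root ::= city | description, where description ::= city " is the capital of " country
ROOT = re.compile(rf"{CITY}(?: is the capital of {COUNTRY})?")


def answer_part(text: str, think_end: str = "</think>") -> str:
    """Return the text after the last think_end marker (the whole text if absent)."""
    return text.rsplit(think_end, 1)[-1].strip()


def matches_grammar(text: str) -> bool:
    """True if the non-reasoning part of text is a complete match for the grammar."""
    return ROOT.fullmatch(answer_part(text)) is not None


if __name__ == "__main__":
    samples = [
        "Paris",
        "Let me think.</think>\n\nParis is the capital of France",
        "600 words.\n\nThe capital of France is Paris.",
    ]
    for sample in samples:
        print(repr(sample[:40]), matches_grammar(sample))
```

The grammar is purely syntactic, so a string such as `London is the capital of Italy` also passes this check; factual correctness has to be verified separately.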

[9]:
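import requests

# Send a raw /generate request whose output is constrained by an EBNF grammar.
response = requests.post(
    f"http://localhost:{port}/generate",
    json={
        "text": "Give me the information of the capital of France.",
        "sampling_params": {
            "max_new_tokens": 2048,
            "temperature": 0,
            # Request three samples for the same prompt.
            "n": 3,
            # Grammar: either a bare city name or "<city> is the capital of <country>".
            "ebnf": (
                "root ::= city | description\n"
                'city ::= "London" | "Paris" | "Berlin" | "Rome"\n'
                'description ::= city " is " status\n'
                'status ::= "the capital of " country\n'
                'country ::= "England" | "France" | "Germany" | "Italy"'
            ),
        },
        "stream": False,
        "return_logprob": False,
    },
)

# The response is a list with one result dict per sample.
print(response.json())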
[2025-04-13 23:31:10 TP0] Prefill batch. #new-seq: 1, #new-token: 10, #cached-token: 1, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:31:10 TP0] Prefill batch. #new-seq: 3, #new-token: 3, #cached-token: 30, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-13 23:31:11 TP0] Decode batch. #running-req: 3, #token: 89, token usage: 0.00, gen throughput (token/s): 94.80, #queue-req: 0,
[... 49 similar Decode batch log lines omitted ...]
[2025-04-13 23:31:32 TP0] Decode batch. #running-req: 3, #token: 6089, token usage: 0.30, gen throughput (token/s): 293.29, #queue-req: 0,
[2025-04-13 23:31:32] INFO:     127.0.0.1:59456 - "POST /generate HTTP/1.1" 200 OK
[{'text': "600 words.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\n[... the same paragraph repeats until the 2048-token limit ...] Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bour", 'meta_info': {'id': 'af17f7beef744024b735db94e756ebf4', 'finish_reason': {'type': 'length', 'length': 2048}, 'prompt_tokens': 11, 'completion_tokens': 2048, 'cached_tokens': 10, 'e2e_latency': 22.01105237007141}},
 {'text': "600 words.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. [... the same paragraph repeats until the 2048-token limit ...] Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bour", 'meta_info': {'id': '72d81d2aaaad4d769e51a0594f540a24', 'finish_reason': {'type': 'length', 'length': 2048}, 'prompt_tokens': 11, 'completion_tokens': 2048, 'cached_tokens': 10, 'e2e_latency': 22.011062622070312}},
 {'text': "600 words.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. [... the same paragraph repeats ...] Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. 
Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. 
Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. 
The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bourguignon being some of the most popular. The city is surrounded by the Seine River, which flows through it, and the bridges over the river add to the city's charm. Paris is a vibrant city with a mix of old-world charm and modern innovation, making it a unique and fascinating place to visit.\n\nThe capital of France is Paris. Paris is one of the most important cities in the world, and it's also the political, cultural, and economic center of France. The city has a rich history that dates back to ancient times, and it's known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Arc de Triomphe. Paris is also famous for its cuisine, with dishes like baguette, croissant, and boeuf bour", 'meta_info': {'id': '859d3491111f4b29883396e1593afafc', 'finish_reason': {'type': 'length', 'length': 2048}, 'prompt_tokens': 11, 'completion_tokens': 2048, 'cached_tokens': 10, 'e2e_latency': 22.011067390441895}}]
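
Both results above end with a `finish_reason` of type `length`, which appears to mean that generation stopped at the `max_new_tokens` limit (2048) rather than finishing on its own. A minimal sketch for flagging such responses follows. The helper name `hit_length_limit` is hypothetical, and the dictionary layout is assumed to match the `meta_info` structure printed above.

```python
from typing import Any


def hit_length_limit(result: dict[str, Any]) -> bool:
    """Return True when a /generate result stopped because of the token limit."""
    finish_reason = result.get("meta_info", {}).get("finish_reason") or {}
    return finish_reason.get("type") == "length"


# Trimmed-down copy of the structure printed above (text elided for brevity).
example = {
    "text": "...",
    "meta_info": {
        "finish_reason": {"type": "length", "length": 2048},
        "completion_tokens": 2048,
    },
}
print(hit_length_limit(example))
```

A check like this can be applied to each element of the response list before trusting the text as a complete answer.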

Regular expression

[10]:
# Constrain the completion to one of two alternatives with a regular expression.
response = requests.post(
    f"http://localhost:{port}/generate",
    json={
        "text": "Paris is the capital of",
        "sampling_params": {
            "temperature": 0,
            "max_new_tokens": 2048,
            "regex": "(France|England)",
        },
    },
)
print(response.json())
{'text': ' the \\( n \\912121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212121212
1212121212121212121212121212121212121212121212121212121212121212121', 'meta_info': {'id': '6d46c6ad9eb4426d9d92f471825a8ea2', 'finish_reason': {'type': 'length', 'length': 2048}, 'prompt_tokens': 6, 'completion_tokens': 2048, 'cached_tokens': 1, 'e2e_latency': 20.07008123397827}}
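
The output above does not match `(France|England)`. One plausible reading, given that grammar restrictions are disabled inside the reasoning section, is that the model never emitted `</think>` for this raw prompt, so the constraint did not take effect before the token limit was reached. The sketch below tests that reading on any generated text. `answer_matches_regex` is a hypothetical helper, and `</think>` is assumed to be the think-end token.

```python
import re

THINK_END = "</think>"  # assumed think-end token for the deepseek-r1 parser


def answer_matches_regex(text: str, pattern: str) -> bool:
    """Check the part of `text` after the reasoning section against `pattern`.

    Returns False when the reasoning section never closes, since in that
    case no constrained answer was produced at all.
    """
    if THINK_END not in text:
        return False
    answer = text.split(THINK_END, 1)[1].strip()
    return re.fullmatch(pattern, answer) is not None


print(answer_matches_regex("Let me think.</think>France", "(France|England)"))
print(answer_matches_regex(" the \\( n \\91212", "(France|England)"))
```

The second call mimics the start of the output shown above, which contains no `</think>` and therefore fails the check.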

Structural Tag

[11]:
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
payload = {
    "text": text,
    "sampling_params": {
        "max_new_tokens": 2048,
        "structural_tag": json.dumps(
            {
                "type": "structural_tag",
                "structures": [
                    {
                        "begin": "<function=get_current_weather>",
                        "schema": schema_get_current_weather,
                        "end": "</function>",
                    },
                    {
                        "begin": "<function=get_current_date>",
                        "schema": schema_get_current_date,
                        "end": "</function>",
                    },
                ],
                "triggers": ["<function="],
            }
        ),
    },
}


# Send POST request to the API endpoint
response = requests.post(f"http://localhost:{port}/generate", json=payload)
print_highlight(response.json())
{'text': 'Alright, I need to provide the JSON information for the Paris France Capital, specifically Paris East. Let me start by identifying the country, which is France. Next, the city is Paris, and the region is Île-de-France-Mrance. The population is approximately 2,176,057. The area is around 2,308.68 km². I should also mention that it\'s the third most populous city globally. Including notable landmarks like the Eiffel Tower, the Louvre, and the Arc de Triomphe adds value. Notable individuals such as Napoleon Bonaparte and Charles de Gaulle should be included. The government roles include President Élisabeth Borne and Mayor Audrey Védelis. The climate is temperate, with a climate type of CN. Lastly, I\'ll note other regions nearby. I should make sure the JSON is correctly formatted to avoid syntax errors.\n\n\nHere is the JSON information for the Paris France Capital, specifically Paris East:\n\n```json\n{\n "country": "France",\n "city": "Paris",\n "region": "Île-de-France-Mrance",\n "population": 2176057,\n "area": 2308.68,\n "currency": "Euro",\n "founded": 556 BCE,\n "🍁"\'s": [\n "Eiffel Tower",\n "Louvre Museum",\n "Arc de Triomphe"\n ],\n ".toolbar-bell": [\n "Napoleon Bonaparte",\n "Charles de Gaulle",\n "Claudecitation:W\'};\n "Jacques Daval",\n "Georges Bataille",\n "Pierre Léon Jouffrel",\n "Pierre-Auguste Renoir",\n "Pierre-Auguste Renoir",\n "Pierre_auguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste Renoir",\n "Pierreauguste 
Renoir",\n "Printemps",\n "Gabriel Se gardening",\n " благодаря à Joseph Beuchamp",\n "面积约 4.8 km²",\n "_financial capital of France",\n ".str(FileOrganization)",\n " pandaexport: \'luke",\n " //pudding",\n " // ALEmbre in photographer",\n " // L - l.Compatibility)citation:W",\n " // L - l.[])\n " // Genuine",\n " // authenticity",\n " // guaranteed,\n " // 24 7",\n " // professional service",\n " // Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__",\n " // Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rud",\n " // Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__
Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph___Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph__Rudolph', 'meta_info': {'id': '475b79b695d2403099749bf040f5e2eb', 'finish_reason': {'type': 'length', 'length': 2048}, 'prompt_tokens': 20, 'completion_tokens': 2048, 'cached_tokens': 19, 'e2e_latency': 20.285722970962524}}
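
When the model does emit a tagged call, the text between `<function=...>` and `</function>` is expected to be JSON that follows the matching schema. A minimal sketch of pulling such calls out of a response follows. `extract_function_calls` is a hypothetical helper, not part of SGLang, and the sample string and its `city` argument are made up for illustration (the response above contains no tagged call).

```python
import json
import re

# Matches <function=NAME>BODY</function>, mirroring the begin/end tags above.
CALL_PATTERN = re.compile(r"<function=(\w+)>(.*?)</function>", re.DOTALL)


def extract_function_calls(text: str) -> list[tuple[str, dict]]:
    """Return (function_name, arguments) pairs found in `text`."""
    calls = []
    for name, body in CALL_PATTERN.findall(text):
        try:
            calls.append((name, json.loads(body)))
        except json.JSONDecodeError:
            # Skip bodies that are not valid JSON, e.g. truncated output.
            continue
    return calls


sample = 'Checking.<function=get_current_weather>{"city": "Paris"}</function>'
print(extract_function_calls(sample))
```

Skipping bodies that fail to parse keeps the helper usable on responses that were cut off at the token limit.
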
[12]:
terminate_process(server_process)
[2025-04-13 23:32:13] Child process unexpectedly failed with an exit code 9. pid=2802594
[2025-04-13 23:32:13] Child process unexpectedly failed with an exit code 9. pid=2802528

Offline Engine API#

[13]:
import sglang as sgl

llm = sgl.Engine(
    model_path="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    reasoning_parser="deepseek-r1",
    grammar_backend="xgrammar",
)
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00,  1.24s/it]

JSON#

Using Pydantic

[14]:
import json
from pydantic import BaseModel, Field


prompts = [
    "Give me the information of the capital of China in the JSON format.",
    "Give me the information of the capital of France in the JSON format.",
    "Give me the information of the capital of Ireland in the JSON format.",
]


# Define the schema using Pydantic
class CapitalInfo(BaseModel):
    name: str = Field(..., pattern=r"^\w+$", description="Name of the capital city")
    population: int = Field(..., description="Population of the capital city")


sampling_params = {
    "temperature": 0,
    "top_p": 0.95,
    "max_new_tokens": 2048,
    "json_schema": json.dumps(CapitalInfo.model_json_schema()),
}

outputs = llm.generate(prompts, sampling_params)
for prompt, output in zip(prompts, outputs):
    print("===============================")
    print(f"Prompt: {prompt}\nGenerated text: {output['text']}")
===============================
Prompt: Give me the information of the capital of China in the JSON format.
Generated text:  and also, make sure that the JSON is valid.

```json
{
  "name": "Beijing",
  "population": 10000000,
  "area": 100000,
  "founded": 1500,
  "coordinates": {
    "latitude": "40.4168",
    "longitude": "-73.9352"
  }
}
```

Is this JSON valid? If not, explain why.

If it is valid, explain why.

Also, provide an updated version of the JSON with any necessary corrections.
</think>{

"name": "Beijing",
"population": 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
===============================
Prompt: Give me the information of the capital of France in the JSON format.
Generated text:  and then convert that JSON into a JSONP object.

Also, create a JSON array containing 5 different cities in France, each with their population and area.

Finally, create a JSON object that contains both the information of the capital and the array of 5 cities.

Make sure to include all the required fields: for the city info, it's name, country, population, area, and the capital flag. For the array, each city should have name, population, and area.

Alright, so I need to figure out how to structure this JSON data. Let me start by recalling what the user is asking for. They want the information of the capital of France in JSON format, then convert that into a JSONP object. Also, they need a JSON array with 5 different cities in France, each having population and area. Finally, they want a JSON object that combines both the capital info and the array of cities.

First, I should identify the capital of France. I know that Paris is the capital. So, the JSON for the capital should include its name, country, population, area, and whether it's the capital. Wait, the user mentioned the fields for the city info should be name, country, population, area, and capital flag. So, the JSON for the capital would look like:

{
  "name": "Paris",
  "country": "France",
  "population": 1000000, // I'm not sure about the exact number, but for example's sake, let's say 1 million.
  "area": 100000, // Again, approximate values.
  "capital": true
}

Next, I need to convert this into a JSONP object. JSONP typically wraps the JSON in a function and adds a "func" key. So, it would look like:

{
  "func": function() {
    return {
      "name": "Paris",
      "country": "France",
      "population": 1000000,
      "area": 100000,
      "capital": true
    };
  }
}

Wait, but JSONP often uses a function that returns the JSON object. So, the structure is correct.

Now, moving on to the array of 5 cities in France. I need to list 5 cities, each with their population and area. I should pick well-known cities to make it accurate. Let me think of some major cities in France:

1. Lyon
2. Marseille
3. Toulouse
4. Lille
5. Nîmes

I need to find their approximate populations and areas. I might not have exact numbers, so I'll have to estimate or use known approximate values.

- Lyon: Population around 400,000, area about 100 square kilometers.
- Marseille: Population around 600,000, area about 150 square kilometers.
- Toulouse: Population around 300,000, area about 120 square kilometers.
- Lille: Population around 450,000, area about 110 square kilometers.
- Nîmes: Population around 250,000, area about 80 square kilometers.

So, the JSON array would be:

[
  {
    "name": "Lyon",
    "population": 400000,
    "area": 100
  },
  {
    "name": "Marseille",
    "population": 600000,
    "area": 150
  },
  {
    "name": "Toulouse",
    "population": 300000,
    "area": 120
  },
  {
    "name": "Lille",
    "population": 450000,
    "area": 110
  },
  {
    "name": "Nîmes",
    "population": 250000,
    "area": 80
  }
]

Now, the final part is to create a JSON object that combines both the capital info and this array. So, the top-level object will have a key, say "data", which contains two keys: "capital" and "cities". The "capital" will be the JSON object we created earlier, and "cities" will be the array.

Putting it all together, the JSON would look like:

{
  "data": {
    "capital": {
      "name": "Paris",
      "country": "France",
      "population": 1000000,
      "area": 100000,
      "capital": true
    },
    "cities": [
      {
        "name": "Lyon",
        "population": 400000,
        "area": 100
      },
      {
        "name": "Marseille",
        "population": 600000,
        "area": 150
      },
      {
        "name": "Toulouse",
        "population": 300000,
        "area": 120
      },
      {
        "name": "Lille",
        "population": 450000,
        "area": 110
      },
      {
        "name": "Nîmes",
        "population": 250000,
        "area": 80
      }
    ]
  }
}

I should double-check the fields to ensure all required ones are included. For the capital, name, country, population, area, and capital flag are all there. For each city, name, population, and area are included. The structure seems correct.

I also need to make sure that the JSON is properly formatted, with commas in the right places and brackets closed properly. I think I've got that covered.

Lastly, I should consider if the population and area numbers are realistic. For example, Paris's population is definitely over 2 million, so 1 million might be too low. Similarly, its area is around 100 square kilometers, which seems accurate. I might want to adjust those numbers if I had more precise data, but for the sake of this exercise, these estimates should suffice.

Overall, I think I've covered all the requirements: created the JSON for the capital, converted it to JSONP, made an array of cities, and combined them into a single JSON object. I just need to present this in the correct format without any markdown, as per the instructions.
</think>{

"name": "Paris",
"population": 200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
===============================
Prompt: Give me the information of the capital of Ireland in the JSON format.
Generated text:  and also, add a "description" field that explains what the capital is like.

The capital of Ireland is calledDublin. It's located in the north of Ireland, and is known for its vibrant city life, beautiful streets, and famous landmarks like the SSE St. Patrick's Cathedral and the...
Okay, I need to provide information about the capital of Ireland, which is Dublin, in JSON format. I also need to include a "description" field that explains what Dublin is like. Let me start by recalling the key points about Dublin.

First, the capital is Dublin itself. It's located in the north of Ireland. Dublin is known for its vibrant city life, which means it's lively and dynamic. The streets are beautiful, lined with elegant buildings and greenery. There are several famous landmarks, such as the SSE St. Patrick's Cathedral, which is a prominent religious site. The city also has the famous Guinness Tower, which houses the world's largest bar. Other notable landmarks include the Phoenix Park, which is a large green space with museums and attractions, and the SSE Arena, where many sports events are held.

Additionally, Dublin is known for its cultural and historical significance. It's the birthplace of JamesJoyce, a renowned author, and has a rich history dating back to medieval times. The city has a vibrant nightlife, with many pubs, clubs, and bars. It's also a major transportation hub, with good public transport options like buses and the Luas light rail system.

Now, I should structure this information into a JSON format. The main information includes the name of the capital, its location, and some key landmarks and features. The "description" field should summarize the overall vibe and characteristics of Dublin.

I should make sure the JSON is properly formatted with keys and values, and that the description is concise but informative. I'll list the landmarks and features as bullet points under the "features" key for clarity.

Let me put it all together now.
</think>{

"name": "Dublin",
"population": 50000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

JSON Schema Directly

[15]:
prompts = [
    "Give me the information of the capital of China in the JSON format.",
    "Give me the information of the capital of France in the JSON format.",
    "Give me the information of the capital of Ireland in the JSON format.",
]

json_schema = json.dumps(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string", "pattern": "^[\\w]+$"},
            "population": {"type": "integer"},
        },
        "required": ["name", "population"],
    }
)

sampling_params = {"temperature": 0, "max_new_tokens": 2048, "json_schema": json_schema}

outputs = llm.generate(prompts, sampling_params)
for prompt, output in zip(prompts, outputs):
    print("===============================")
    print(f"Prompt: {prompt}\nGenerated text: {output['text']}")
===============================
Prompt: Give me the information of the capital of China in the JSON format.
Generated text:  and also, make sure that the JSON is valid.

```json
{
  "name": "Beijing",
  "population": 10000000,
  "area": 100000,
  "founded": 1500,
  "coordinates": {
    "latitude": "40.4168",
    "longitude": "-73.9352"
  }
}
```

Is this JSON valid? If not, explain why.

If it is valid, explain why.

Also, provide an updated version of the JSON with any necessary corrections.
</think>{

"name": "Beijing",
"population": 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
===============================
Prompt: Give me the information of the capital of France in the JSON format.
Generated text:  and then convert that JSON into a JSONP object.

Also, create a JSON array containing 5 different cities in France, each with their population and area.

Finally, create a JSON object that contains both the information of the capital and the array of 5 cities.

Make sure to include all the required fields: for the city info, it's name, country, population, and area. For the array, each city should have name, population, and area.

Alright, so I need to figure out how to provide the information of the capital of France in JSON format, then convert that into a JSONP object. After that, I have to create a JSON array with five different cities in France, each including their population and area. Finally, I need to combine both the capital info and the array into a single JSON object.

First, let me recall what the capital of France is. I think it's Paris. Yes, Paris is the capital city of France. Now, I need to find the population and area of Paris. I'm not exactly sure about the current numbers, but I remember that Paris has a large population and a significant area. Let me check my memory: I think the population is around 2 million and the area is about 105 square kilometers. I should verify these numbers to be accurate, but for the sake of this exercise, I'll go with these figures.

Next, I need to represent this information in JSON format. The structure should include the name, country, population, and area. So, the JSON object for the capital would look like this:

{
  "capital": {
    "name": "Paris",
    "country": "France",
    "population": 2000000,
    "area": 105
  }
}

Now, moving on to the JSONP object. JSONP typically involves wrapping the JSON object in a function and providing a callback. The syntax is usually function() { return JSON; }. So, converting the above JSON into a JSONP object would involve adding a function wrapper. It would look something like:

function() {
  return {
    "capital": {
      "name": "Paris",
      "country": "France",
      "population": 2000000,
      "area": 105
    }
  };
}

I think that's the correct way to structure a JSONP object.

Now, onto the second part: creating a JSON array of five different cities in France, each with their population and area. I need to choose five cities. Let me think of some major cities in France. I know Paris, but I need four more. Maybe Lyon, Marseille, Toulouse, and Nice? Or maybe I should include some other cities to make it diverse. Let me check their populations and areas.

1. Paris: As mentioned, population around 2 million, area about 105 km².
2. Lyon: I believe Lyon has a population of around 400,000 and an area of about 100 km².
3. Marseille: Population is around 1.5 million, area about 100 km².
4. Toulouse: Population is around 300,000, area about 100 km².
5. Nice: Population is around 600,000, area about 200 km².

Wait, I should verify these numbers because I might be off. For example, Nice's area is actually larger, maybe around 200 km², but the population is a bit lower. Let me adjust accordingly.

So, the JSON array would look like this:

[
  {
    "name": "Paris",
    "population": 2000000,
    "area": 105
  },
  {
    "name": "Lyon",
    "population": 400000,
    "area": 100
  },
  {
    "name": "Marseille",
    "population": 1500000,
    "area": 100
  },
  {
    "name": "Toulouse",
    "population": 300000,
    "area": 100
  },
  {
    "name": "Nice",
    "population": 600000,
    "area": 200
  }
]

I think that's a reasonable approximation. Now, I need to combine both the capital info and this array into a single JSON object. The structure should have a key, say "info", which contains both the capital and the array. So, the final JSON object would be:

{
  "info": {
    "capital": {
      "name": "Paris",
      "country": "France",
      "population": 2000000,
      "area": 105
    },
    "cities": [
      {
        "name": "Paris",
        "population": 2000000,
        "area": 105
      },
      {
        "name": "Lyon",
        "population": 400000,
        "area": 100
      },
      {
        "name": "Marseille",
        "population": 1500000,
        "area": 100
      },
      {
        "name": "Toulouse",
        "population": 300000,
        "area": 100
      },
      {
        "name": "Nice",
        "population": 600000,
        "area": 200
      }
    ]
  }
}

I should make sure that all the fields are correctly named and that the data types are appropriate. The populations and areas should be numbers, and the names should be strings. Also, I should ensure that the JSON syntax is correct, with proper commas and brackets.

Wait, in the array, I have "capital" as one of the cities, which is Paris. That's okay because it's part of the array, but I should make sure that the array includes five different cities, and since Paris is the capital, it's included in both the capital info and the array. That's acceptable as per the problem statement.

I think I've covered all the requirements. Now, I'll present the final answer with the JSON object containing both the capital information and the array of cities.
</think>{

"name": "Paris",
"population": 2000000
}
===============================
Prompt: Give me the information of the capital of Ireland in the JSON format.
Generated text:  and also, in the JSON, include the population, area, and the official language.

The capital of Ireland is ______.

The population of Ireland is approximately 5 million. The area is about 9,400 square kilometers. The official language is English.

Please provide the JSON structure with the key-value pairs.

Okay, so I need to figure out how to provide the information about the capital of Ireland in JSON format. The user has already given me some details: the population is about 5 million, the area is approximately 9,400 square kilometers, and the official language is English. They also mentioned that the capital is ______, but I think I need to fill that in.

First, I should recall what the capital of Ireland is. I'm pretty sure it's Dublin. Yeah, that's right. Dublin is the largest city in Ireland and serves as its administrative capital, even though Ireland is a constitutional monarchy and doesn't have a federal capital.

Now, I need to structure this information into a JSON format. JSON typically uses key-value pairs, so I'll need to decide which keys to use. The user mentioned population, area, and official language, so I can include those. Also, since the capital is Dublin, I should include that as a key as well.

So, the keys I'll use are probably "capital", "population", "area", and "official_language". Each of these will have their respective values. Let me make sure about the numbers. The population is approximately 5 million, so I'll write that as 5000000. The area is about 9,400 square kilometers, so that's 9400. The official language is English, so that's straightforward.

Putting it all together, the JSON structure should look like this:

{
  "capital": "Dublin",
  "population": 5000000,
  "area": 9400,
  "official_language": "English"
}

I think that covers everything the user asked for. They wanted the JSON with key-value pairs, and I included the capital, population, area, and official language. I should double-check if there are any other details they might need, but based on the information given, this should be sufficient.

Wait, the user also mentioned that the population is approximately 5 million, the area is about 9,400 square kilometers, and the official language is English. I included all of these in the JSON. I don't think I need to add anything else unless they specify more details, but they didn't ask for more information, so I think this is all that's needed.

I should also make sure that the JSON syntax is correct. The keys are in double quotes, the string values are in double quotes, and the numbers are without quotes. The commas are in the right places, and the structure is properly formatted. I don't see any syntax errors here.

So, to summarize, the JSON object will have four key-value pairs: capital as "Dublin", population as 5000000, area as 9400, and official_language as "English". This should fulfill the user's request accurately.
</think>{

"name": "Dublin",
"population": 5000000
}
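The outputs above contain the free-form reasoning, then `</think>`, then the schema-constrained JSON. A minimal sketch of separating the two parts and parsing the answer could look like the following. The helper name `split_reasoning` and the sample string are illustrative and not part of SGLang, and the end token is assumed to be `</think>` as used by the `deepseek-r1` parser.

```python
import json
from typing import Optional, Tuple

THINK_END = "</think>"  # assumed end-of-reasoning token


def split_reasoning(text: str, end_token: str = THINK_END) -> Tuple[str, Optional[str]]:
    """Return (reasoning, answer). The answer is None if the end token never appears."""
    reasoning, sep, answer = text.partition(end_token)
    if not sep:
        # The generation stopped before the reasoning section was closed.
        return text, None
    return reasoning, answer.strip()


# Hypothetical output shaped like the samples above.
sample = 'Let me think about Paris.\n</think>{\n\n"name": "Paris",\n"population": 2000000\n}'

reasoning, answer = split_reasoning(sample)
info = json.loads(answer)
print(info["name"], info["population"])
```

Checking for a `None` answer before calling `json.loads` is worthwhile in practice, because a generation that hits `max_new_tokens` inside the reasoning section never reaches the constrained part.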

EBNF#

[16]:
prompts = [
    "Give me the information of the capital of France.",
    "Give me the information of the capital of Germany.",
    "Give me the information of the capital of Italy.",
]

sampling_params = {
    "temperature": 0.8,
    "top_p": 0.95,
    "ebnf": (
        "root ::= city | description\n"
        'city ::= "London" | "Paris" | "Berlin" | "Rome"\n'
        'description ::= city " is " status\n'
        'status ::= "the capital of " country\n'
        'country ::= "England" | "France" | "Germany" | "Italy"'
    ),
}

outputs = llm.generate(prompts, sampling_params)
for prompt, output in zip(prompts, outputs):
    print("===============================")
    print(f"Prompt: {prompt}\nGenerated text: {output['text']}")
===============================
Prompt: Give me the information of the capital of France.
Generated text: 99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999
===============================
Prompt: Give me the information of the capital of Germany.
Generated text: 600 characters

The capital of Germany is Berlin. Berlin has been the capital of the country since 1949, after World War II. The city is located in northern Germany and is one of the most vibrant and cosmopolitan capitals in Europe. It is surrounded by several states, including North Schleswig, Holstein, and Schleswig-Holstein, making it a federal city-state. Berlin is known for its rich history, numerous museums, and landmarks. It has a diverse population with a mix of nationalities and cultures. The city is home to many famous museums, including the Berlin Wall Memorial, the
===============================
Prompt: Give me the information of the capital of Italy.
Generated text: 6875

<think>
Alright, I need to provide information about the capital of Italy, which is Vatican City. Let me start by recalling the basic facts I know. Vatican City is indeed the smallest city-state in the world, right? It's located in central Italy, close to the city of Rome. But I should make sure about the exact coordinates. I think it's around 42 degrees latitude north and 12 degrees longitude east. I remember that it's only about 0.4 square kilometers, which makes it tiny.

I also know that Vatican City is part of the Vatican City State, which
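None of the sample outputs above reaches a closing `</think>`, so they show only the unconstrained reasoning part and no grammar-constrained text. To make clear what the grammar itself accepts, the sketch below enumerates every string it can produce. The grammar is small and non-recursive, so this is a plain Cartesian product. The names `grammar_language` and `conforms` are illustrative helpers, not SGLang APIs.

```python
from itertools import product

# Terminals copied from the EBNF grammar in the cell above.
CITIES = ["London", "Paris", "Berlin", "Rome"]
COUNTRIES = ["England", "France", "Germany", "Italy"]


def grammar_language() -> set:
    """Enumerate every string derivable from `root ::= city | description`."""
    cities = set(CITIES)
    # description ::= city " is " status ; status ::= "the capital of " country
    descriptions = {
        f"{city} is the capital of {country}"
        for city, country in product(CITIES, COUNTRIES)
    }
    return cities | descriptions


def conforms(answer: str) -> bool:
    """Check whether a post-reasoning answer is in the grammar's language."""
    return answer in grammar_language()


print(len(grammar_language()))
print(conforms("Paris is the capital of France"))
```

The grammar constrains form, not facts: `"Rome is the capital of England"` is also accepted, so any factual check has to happen outside the grammar.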

Regular expression#

[17]:
prompts = [
    "Please provide information about London as a major global city:",
    "Please provide information about Paris as a major global city:",
]

sampling_params = {"temperature": 0.8, "top_p": 0.95, "regex": "(France|England)"}

outputs = llm.generate(prompts, sampling_params)
for prompt, output in zip(prompts, outputs):
    print("===============================")
    print(f"Prompt: {prompt}\nGenerated text: {output['text']}")
===============================
Prompt: Please provide information about London as a major global city:
Generated text:  the location and size of the city, its main modes of transportation, famous landmarks, and the cultural significance in the history of the world.100 words.

**<Ages 11-13>**
Okay, so I need to find out about London as a major global city. First, I should figure out where London is located. I think it's in England, right? I remember my geography teacher mentioning that London is at the mouth of the Thames River. That's a big river, so the city must be along the river.

Next, I need to know the size of London. I think London is
===============================
Prompt: Please provide information about Paris as a major global city:
Generated text:  the location, population, economic powerhouses, cultural significance, and major landmarks.

7 sentences in total.

** Paris is located in northern France, on the Seine River, and was once the capital of France.

** The population of Paris is approximately 2.2 million people.

** Paris is home to many industries such as fashion, technology, and tourism.

** The Eiffel Tower and the Louvre are two of the most famous landmarks in Paris.

** Paris has been a significant cultural and artistic center throughout history.

** The city has also been a major economic powerhouse, attracting global investment and business.

** Paris has been a
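Both samples above stop while the model is still producing free-form text, so the regex-constrained part is never reached. A small sketch for telling "reasoning not finished" apart from "answer does not match" follows. The helper `answer_matches` is illustrative only, and it assumes the reasoning section ends with `</think>`.

```python
import re
from typing import Optional


def answer_matches(text: str, pattern: str, end_token: str = "</think>") -> Optional[bool]:
    """Return None if the reasoning never ended, else whether the answer fully matches."""
    _, sep, answer = text.partition(end_token)
    if not sep:
        return None
    return re.fullmatch(pattern, answer.strip()) is not None


PATTERN = "(France|England)"  # same regex as in the cell above

print(answer_matches("London is in England.</think>England", PATTERN))
print(answer_matches("I think London is", PATTERN))
```

A `None` result suggests raising `max_new_tokens` rather than changing the pattern.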

Structural tag#

[18]:
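# `tokenizer`, `messages`, and the two `schema_*` objects are not defined in this cell;
# they are assumed to carry over from earlier cells of this notebook.
text = tokenizer.apply_chat_template(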
    messages, tokenize=False, add_generation_prompt=True
)
prompts = [text]


sampling_params = {
    "temperature": 0.8,
    "top_p": 0.95,
    "max_new_tokens": 2048,
    "structural_tag": json.dumps(
        {
            "type": "structural_tag",
            "structures": [
                {
                    "begin": "<function=get_current_weather>",
                    "schema": schema_get_current_weather,
                    "end": "</function>",
                },
                {
                    "begin": "<function=get_current_date>",
                    "schema": schema_get_current_date,
                    "end": "</function>",
                },
            ],
            "triggers": ["<function="],
        }
    ),
}


# Generate with the offline engine (no HTTP request is involved)
outputs = llm.generate(prompts, sampling_params)
for prompt, output in zip(prompts, outputs):
    print("===============================")
    print(f"Prompt: {prompt}\nGenerated text: {output['text']}")
===============================
Prompt: <|begin▁of▁sentence|><|User|>Here is the information of the capital of France in the JSON format.
<|Assistant|><think>

Generated text: Alright, so I need to find the population of Paris, the capital of France, as of 2023. I know that Paris is a major city in France, so I don't think there's any confusion there. The user already provided the population as around 3.5 million, but I should verify that.

First, I'll recall that Paris is known for being one of the most populous cities in Europe, but I'm not sure exactly how many people live there. I think it's less than 4 million, maybe around 3.5 to 3.6 million. But I'm not certain, so I should cross-check that.

I remember hearing that France's population is around 40 million, but that's the whole country. Paris is the largest city, so its population should be a significant portion of that. Maybe I can think about the growth rate. I think Paris has been growing steadily, especially with new developments and tourism, but the growth might be slowing down.

I should also consider that the population count can vary based on sources. Some sources might cite the official government statistics, while others might have estimates from recent surveys. I should look for the most recent official data available. Maybe the latest census or a recent demographic study.

I wonder if there are any recent events, like the COVID-19 pandemic, that might have affected the population numbers. That could cause temporary fluctuations, but populations usually recover over time. Also, immigration and natural increase would contribute to the growth.

Another angle is to think about how population density works in Paris. It's very dense with high-rise buildings and a lot of people living in apartments. That probably contributes to a high population count even in a city the size of Paris.

I should also think about the city limits and whether the population figure includes surrounding suburbs or just the urban core. Sometimes population counts are only for the metro area or the urban center, which might exclude outskirts. But I believe the given figure was for the metropolitan area, including the outer limits.

I recall that Paris has been a major city for a long time, and its population has been increasing steadily. So, 3.5 million seems plausible, but I should check a reliable source to confirm. Maybe the National Institute of Statistics and Research (Insee) in France provides the latest data.

Wait, if I look up the 2020 census data, it might state that Paris has a metropolitan population of about 3.5 million. That would align with the user's information. But I should also consider that the population might have increased a bit since then, but not by a huge margin.

Additionally, comparing with other major cities in France, like Lyon or Marseille, their populations are in the 1-2 million range, so Paris being over 3 million makes sense as the largest.

I should also think about any projections for the future. Population growth is often estimated, and factors like urbanization, birth rates, and immigration play a role. If Paris's growth rate is moderate, it's expected to stay competitive among major European cities.

In summary, based on my reasoning, Paris's population as of 2023 is approximately 3.5 million people.
</think>

As of the latest available data, the metropolitan area of Paris, the capital of France, has a population of approximately 3.5 million people. This figure reflects the most recent estimates from official sources, such as the National Institute of Statistics and Research (Insee), considering factors like urban growth and demographic trends.
[19]:
llm.shutdown()
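
The generated text above holds the free-form reasoning and the final answer in a single string, separated by the `</think>` token. The following is a minimal sketch of how the two parts could be separated on the client side. The helper name `split_reasoning` and the sample string are hypothetical and not part of SGLang. It also assumes the prompt already ends with `<think>`, so output with no end token is treated as unfinished reasoning.

```python
from typing import Tuple


def split_reasoning(text: str, think_end_token: str = "</think>") -> Tuple[str, str]:
    """Split generated text into (reasoning, answer) at the first end token.

    Hypothetical helper, not an SGLang API. If the end token is missing,
    the whole text is treated as unfinished reasoning and the answer is empty.
    """
    reasoning, sep, answer = text.partition(think_end_token)
    if not sep:
        # The model never closed its reasoning section (e.g. it hit max tokens).
        return text.strip(), ""
    return reasoning.strip(), answer.strip()


# Illustrative sample only; in practice pass output["text"] from llm.generate.
sample = 'Paris is the capital of France.\n</think>\n\n{"name": "Paris"}'
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Only the answer part is subject to the grammar constraint, so that is the part to hand to a JSON parser or schema validator.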