{"id":31150,"date":"2026-06-12T16:24:05","date_gmt":"2026-06-12T14:24:05","guid":{"rendered":"https:\/\/contabo.com\/blog\/?p=31150"},"modified":"2026-06-12T16:36:06","modified_gmt":"2026-06-12T14:36:06","slug":"litellm-vs-ai-gateways","status":"publish","type":"post","link":"https:\/\/contabo.com\/blog\/litellm-vs-ai-gateways\/","title":{"rendered":"LiteLLM vs Portkey, Kong &amp; Cloudflare: AI Gateways Compared"},"content":{"rendered":"\n<p>If LiteLLM is the open-source, self-hosted default for routing LLM traffic, the main alternatives each lean a different way: Portkey adds governance and guardrails, Kong AI Gateway fits enterprises already running Kong, and Cloudflare AI Gateway is the managed, ecosystem-native option. This compares all four \u2014 and clears up where vLLM and Ollama actually fit, because they&#8217;re not gateways at all.<\/p>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-34352a21\"><h2 class=\"uagb-heading-text\">Quick Verdict<\/h2><\/div>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Pick LiteLLM if you want an open-source gateway you self-host and fully own.<\/li>\n\n\n\n<li>Pick Portkey if governance, guardrails, and deep observability are priorities.<\/li>\n\n\n\n<li>Pick Kong AI Gateway if you already operate a Kong API mesh.<\/li>\n\n\n\n<li>Pick Cloudflare AI Gateway if your app lives in the Cloudflare ecosystem and you want zero ops.<\/li>\n<\/ul>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-0caf2518\"><h2 class=\"uagb-heading-text\">AI Gateways Compared at a Glance<\/h2><\/div>\n\n\n\n<!-- AI Gateways Compared at a Glance -->\n<figure class=\"wp-block-table is-style-stripes\" style=\"overflow-x:auto;\">\n  <table style=\"width:100%;border-collapse:collapse;font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;font-size:15px;line-height:1.45;\">\n    <thead>\n      <tr style=\"background:#f3f4f6;text-align:left;\">\n        <th scope=\"col\" style=\"padding:10px 12px;border-bottom:2px 
solid #d1d5db;\">Gateway<\/th>\n        <th scope=\"col\" style=\"padding:10px 12px;border-bottom:2px solid #d1d5db;\">Self-hosted?<\/th>\n        <th scope=\"col\" style=\"padding:10px 12px;border-bottom:2px solid #d1d5db;\">Open-source?<\/th>\n        <th scope=\"col\" style=\"padding:10px 12px;border-bottom:2px solid #d1d5db;\">Guardrails<\/th>\n        <th scope=\"col\" style=\"padding:10px 12px;border-bottom:2px solid #d1d5db;\">Ops overhead<\/th>\n        <th scope=\"col\" style=\"padding:10px 12px;border-bottom:2px solid #d1d5db;\">Best for<\/th>\n      <\/tr>\n    <\/thead>\n    <tbody>\n      <tr>\n        <th scope=\"row\" style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;text-align:left;font-weight:600;\">LiteLLM<\/th>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Yes<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Yes (MIT)<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Basic<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Low<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Self-hosted, open-source teams<\/td>\n      <\/tr>\n      <tr>\n        <th scope=\"row\" style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;text-align:left;font-weight:600;\">Portkey<\/th>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Limited<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Core open + managed<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Strong (semantic caching, guardrails)<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Low (managed)<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Governance &amp; observability<\/td>\n      <\/tr>\n      <tr>\n        <th scope=\"row\" style=\"padding:10px 12px;border-bottom:1px solid 
#e5e7eb;text-align:left;font-weight:600;\">Kong AI Gateway<\/th>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Yes<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Core open<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Via plugins<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">High<\/td>\n        <td style=\"padding:10px 12px;border-bottom:1px solid #e5e7eb;\">Existing Kong \/ enterprise<\/td>\n      <\/tr>\n      <tr>\n        <th scope=\"row\" style=\"padding:10px 12px;text-align:left;font-weight:600;\">Cloudflare AI Gateway<\/th>\n        <td style=\"padding:10px 12px;\">No<\/td>\n        <td style=\"padding:10px 12px;\">No<\/td>\n        <td style=\"padding:10px 12px;\">Platform features<\/td>\n        <td style=\"padding:10px 12px;\">Near-zero<\/td>\n        <td style=\"padding:10px 12px;\">Cloudflare-ecosystem apps<\/td>\n      <\/tr>\n    <\/tbody>\n  <\/table>\n<\/figure>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-6d974ec9\"><h3 class=\"uagb-heading-text\">LiteLLM vs Kong AI Gateway<\/h3><\/div>\n\n\n\n<p>Kong AI Gateway brings LLM routing into Kong&#8217;s mature API-management platform, with a strong plugin ecosystem, SSO, and capabilities like PII redaction. That power comes with weight: it&#8217;s heavier to operate and assumes Kong infrastructure underneath. LiteLLM is the lighter, simpler, more self-contained option. 
The rule of thumb: choose Kong if you already run a Kong API mesh and want LLM traffic governed the same way; otherwise LiteLLM takes far less effort to operate.<\/p>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-c9d511f5\"><h3 class=\"uagb-heading-text\">LiteLLM vs Cloudflare AI Gateway<\/h3><\/div>\n\n\n\n<p>Cloudflare AI Gateway is fully managed with near-zero operational overhead, adding caching and analytics in front of your providers, and it shines when your application already lives in the Cloudflare ecosystem. The trade-off is the familiar managed one: your requests pass through Cloudflare, so you give up some data control in exchange for not running anything. LiteLLM is the opposite choice: you operate it, and in return the full data path stays in your infrastructure.<\/p>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-ce22adb1\"><h3 class=\"uagb-heading-text\">Where vLLM and Ollama Fit (They&#8217;re Not Gateways)<\/h3><\/div>\n\n\n\n<p>This is the comparison people most often get wrong. vLLM and Ollama are not gateways; they&#8217;re inference engines and runtimes that actually run the models. A gateway like LiteLLM sits in front of them: it routes a request to a vLLM or Ollama backend exactly as it would to a cloud provider. So &#8220;LiteLLM vs vLLM&#8221; or &#8220;LiteLLM vs Ollama&#8221; is a category error; the real pattern is using them together, with a gateway for routing and control and a runtime for the inference. 
If self-hosted inference is your goal, the linked Ollama guides are the place to start.<\/p>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-c2f605ae\"><h2 class=\"uagb-heading-text\">Which AI Gateway Should You Choose?<\/h2><\/div>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Open-source and self-hosted \u2192 LiteLLM.<\/li>\n\n\n\n<li>Governance, guardrails, observability \u2192 Portkey.<\/li>\n\n\n\n<li>Already running Kong \/ enterprise needs \u2192 Kong AI Gateway.<\/li>\n\n\n\n<li>Cloudflare-ecosystem app, zero ops \u2192 Cloudflare AI Gateway.<\/li>\n\n\n\n<li>Running your own models \u2192 pair LiteLLM (gateway) with vLLM or Ollama (runtime).<\/li>\n<\/ul>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-bf532f25\"><h2 class=\"uagb-heading-text\">How to Self-Host LiteLLM on a VPS<\/h2><\/div>\n\n\n\n<p>Of these, LiteLLM is the most self-host-friendly: a CPU-only proxy (no GPU required) with a small PostgreSQL database that runs comfortably on a modest virtual private server. A VPS gives you root access for Docker, full data control, and EU data-residency options; if you also want to self-host the models, you can pair the gateway with a GPU instance running vLLM or Ollama. Contabo&#8217;s Core VPS line offers strong RAM-per-Euro value for the gateway, with GPU options available for inference. See the linked Docker setup guide to deploy.<\/p>\n\n\n\n<div class=\"wp-block-uagb-advanced-heading uagb-block-7e8a212f\"><h2 class=\"uagb-heading-text\">FAQ: LiteLLM vs Other AI Gateways<\/h2><\/div>\n\n\n\n<div class=\"schema-faq wp-block-yoast-faq-block\"><div class=\"schema-faq-section\" id=\"faq-question-1780487018130\"><strong class=\"schema-faq-question\">What is the best alternative to LiteLLM?<\/strong> <p class=\"schema-faq-answer\">It depends on what you need. Portkey is the closest alternative if you want governance and guardrails as a managed control plane. 
Kong AI Gateway suits enterprises already on Kong, and Cloudflare AI Gateway fits managed, ecosystem-native setups. For open-source self-hosting specifically, LiteLLM remains the leading choice.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1780487030561\"><strong class=\"schema-faq-question\">LiteLLM vs Portkey \u2014 which is better?<\/strong> <p class=\"schema-faq-answer\">Neither is universally better. LiteLLM is better if you want an open-source gateway you self-host and own. Portkey is better if you want managed governance, guardrails, semantic caching, and observability without running the infrastructure. The choice is mainly about delivery model: self-hosted ownership versus a managed control plane.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1780487043423\"><strong class=\"schema-faq-question\">Is vLLM an alternative to LiteLLM?<\/strong> <p class=\"schema-faq-answer\">No. vLLM is an inference engine that runs models, while LiteLLM is a gateway that routes requests to model backends. They operate at different layers and are typically used together \u2014 LiteLLM in front, routing to a vLLM backend. So vLLM complements LiteLLM rather than replacing it.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1780487062254\"><strong class=\"schema-faq-question\">Is Ollama an LLM gateway?<\/strong> <p class=\"schema-faq-answer\">No. Ollama is a local runtime for running models on your own hardware, not a gateway. A gateway such as LiteLLM can sit in front of Ollama and route requests to it alongside other providers. 
If you want local inference, you&#8217;d use Ollama as a backend behind your gateway, not instead of it.<\/p> <\/div> <\/div>\n","protected":false},"excerpt":{"rendered":"<p>If LiteLLM is the open-source, self-hosted default for routing LLM traffic, the main alternatives each lean a different way: Portkey adds governance and guardrails, Kong AI Gateway fits enterprises already running Kong, and Cloudflare AI Gateway is the managed, ecosystem-native option. This compares all four \u2014 and clears up where vLLM and Ollama actually fit, [&hellip;]<\/p>\n","protected":false},"author":78,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[1535],"tags":[4459,4465,4462,4463,4466,3295,4460,3319],"ppma_author":[4285],"class_list":["post-31150","post","type-post","status-publish","format-standard","hentry","category-comparisons","tag-ai-gateway","tag-cloudflare-ai-gateway","tag-kong-ai-gateway","tag-llm-gateway","tag-llm-routing","tag-ollama","tag-portkey","tag-self-hosted-ai"],"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false},"uagb_author_info":{"display_name":"Jie Guo","author_link":"https:\/\/contabo.com\/blog\/author\/jieguo\/"},"uagb_comment_info":0,"uagb_excerpt":"If LiteLLM is the open-source, self-hosted default for routing LLM traffic, the main alternatives each lean a different way: Portkey adds governance and guardrails, Kong AI Gateway fits enterprises already running Kong, and Cloudflare AI Gateway is the managed, ecosystem-native option. 
This compares all four \u2014 and clears up where vLLM and Ollama actually fit,&hellip;","authors":[{"term_id":4285,"user_id":78,"is_guest":0,"slug":"jieguo","display_name":"Jie Guo","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/4e0d981b06988d6d456834e9d55bc9e713e918fa8444325543d14f448154106b?s=96&d=mm&r=g","author_category":"","user_url":"","last_name":"Guo","first_name":"Jie","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/posts\/31150","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/users\/78"}],"replies":[{"embeddable":true,"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/comments?post=31150"}],"version-history":[{"count":1,"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/posts\/31150\/revisions"}],"predecessor-version":[{"id":31151,"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/posts\/31150\/revisions\/31151"}],"wp:attachment":[{"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/media?parent=31150"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/categories?post=31150"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/tags?post=31150"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/contabo.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=31150"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}