Pros and Cons of using a Model Router
Model routers for LLMs: when they shine, when they fail, how to evaluate them, and a simple starter approach
A non-comprehensive but still awesome list of AI development tools – IDEs, extensions, CLIs, and asynchronous coding agents
A step-by-step guide to configure aider and Continue with Azure-hosted o3-mini and DeepSeek-R1 LLMs for AI-assisted development
I’ve just released v0.7 of Grabit, my little command line app for saving full-text copies of webpages. It brings support for saving Reddit posts (I really wanted to do this) and custom user agents (I didn’t really want to do this, but here we are). It also prettifies the markdown to make sure it looks just the way it should; nobody likes 10 blank rows before every bullet point. One more interesting thing: I’m experimenting with using o1 to help me keep the README in sync with the new changes and, in general, to automate the boring parts of releasing a new version. ...
How to integrate smolagents with Azure OpenAI to build Python-driven AI agents. Also, lots of ducks.
How Azure OpenAI’s prompt caching feature works, its benefits, caveats, and a quick experiment
Generative AI models I like
Understand the differences in pricing between Azure OpenAI and OpenAI for fine-tuning AI models, with a detailed analysis of token and hosting costs.
I heard you like OpenAI, so I used OpenAI’s Whisper to transcribe the OpenAI DevDay Keynote, OpenAI GPT-4 Turbo to summarize the transcript, come up with ideas that illustrate the main points and generate DALL-E prompts for said ideas, OpenAI DALL·E 3 to generate the images, and OpenAI Text to Speech to narrate the summary. Xzibit would be like, so proud.
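For context, here is roughly what that chain looks like with the OpenAI Python SDK. This is just a minimal sketch, not the exact script from the post: the file name devday_keynote.mp3 and the model IDs (whisper-1, gpt-4-turbo-preview, dall-e-3, tts-1, the alloy voice) are my assumptions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Transcribe the keynote recording with Whisper (assumed local file)
with open("devday_keynote.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Summarize the transcript and propose image prompts with GPT-4 Turbo
summary = client.chat.completions.create(
    model="gpt-4-turbo-preview",
    messages=[
        {
            "role": "system",
            "content": "Summarize the talk and suggest one DALL-E prompt per main point.",
        },
        {"role": "user", "content": transcript.text},
    ],
).choices[0].message.content

# 3. Generate an illustration with DALL-E 3 (a single prompt shown for brevity)
image = client.images.generate(model="dall-e-3", prompt=summary[:1000])
print(image.data[0].url)

# 4. Narrate the summary with text-to-speech
speech = client.audio.speech.create(model="tts-1", voice="alloy", input=summary)
speech.write_to_file("summary.mp3")
```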
About Orca-2: The fine folks at Microsoft Research have recently published Orca 2, a new small large language model, and apparently it’s quite good! Just look at the test results below – on average, both the 7B and the 13B variants are significantly better than Llama-2-Chat-70B, with Orca-2-13B surpassing even WizardLM-70B. Pretty cool! 🚀 I also love the idea behind it: prompting a big large language model (in this case GPT-4) to answer some rather convoluted logic questions while aided by some very specific system prompts, and then fine-tuning a smaller model (Llama-2-7B and 13B respectively) on just the question and answer pairs, leaving out the detailed system prompts. ...
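The data prep behind that idea is conceptually simple. Here is a minimal sketch of the "drop the system prompt" step, assuming hypothetical teacher records with system_prompt, question, and answer fields; the record contents, file name, and chat-style JSONL format are illustrative assumptions, not Orca 2’s actual pipeline.

```python
import json

# Hypothetical teacher outputs: each record keeps the detailed system prompt
# that guided the big model, plus the question and the answer it produced.
teacher_records = [
    {
        "system_prompt": "Think step by step, list every assumption, then answer.",
        "question": "If all bloops are razzies and some razzies are lazzies, are all bloops lazzies?",
        "answer": "No. Being a razzie does not guarantee being a lazzie, so the conclusion does not follow.",
    },
]

# Build the student fine-tuning set: keep only question -> answer pairs,
# leaving out the system prompt so the small model learns the behaviour
# without ever seeing the instructions that elicited it.
with open("student_train.jsonl", "w", encoding="utf-8") as f:
    for rec in teacher_records:
        example = {
            "messages": [
                {"role": "user", "content": rec["question"]},
                {"role": "assistant", "content": rec["answer"]},
            ]
        }
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```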