Real‑world video demo: Using different AI models in GitHub Copilot

Curious about how AI models perform in real-world scenarios with GitHub Copilot? Same. We made a live video demo to find out, and wrote up our key takeaways.

6 minute read

Claude 3.7 Sonnet, Gemini 2.5 Pro, GPT-4… developer choice is key to GitHub Copilot, and that’s especially true when it comes to picking your frontier model of choice. 

But with so many frontier generative AI models now available to use with GitHub Copilot (and more coming seemingly every day), how do you pick the right one for the job—especially with the growing capabilities of Copilot Chat, edit, ask, and agent modes?

In a recent video, I worked with GitHub’s Developer Advocate Kedasha Kerr (aka @ladykerr) to answer this exact question. Our goal? To build the same travel‑reservation app three different ways with Copilot ask, edit, and agent modes while swapping between Copilot’s growing roster of foundation models to compare each AI model in real-world development workflows. 

We set out to build a very simple travel‑reservation web app (think “browse hotel rooms, pick dates, book a room”). To keep the demo snappy, we chose a lightweight stack:

  • Backend: Flask REST API
  • Frontend: Vue.js, styled with Tailwind
  • Data: a local data.json file instead of a real database

That gave us just enough surface area to compare models while scaffolding the app, wiring up endpoints, and adding tests, docs, and security tweaks along the way.
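To make that stack a bit more concrete, here's a minimal sketch of the kind of booking logic such an app might keep behind its Flask endpoints. This is illustrative only, not code from the video: the `data.json` shape, field names, and the `available_rooms` helper are all assumptions.

```python
import json
from datetime import date

# Illustrative shape for data.json; the demo's actual schema isn't shown in the video.
SAMPLE_DATA = json.loads("""
{
  "rooms": [
    {"id": 1, "name": "Ocean View", "price": 220, "bookings": []},
    {"id": 2, "name": "Garden Suite", "price": 180,
     "bookings": [{"start": "2025-07-01", "end": "2025-07-05"}]}
  ]
}
""")

def overlaps(start_a, end_a, start_b, end_b):
    """Two half-open date ranges overlap unless one ends before the other starts."""
    return not (end_a <= start_b or end_b <= start_a)

def available_rooms(data, start, end):
    """Return rooms with no booking that overlaps [start, end)."""
    free = []
    for room in data["rooms"]:
        taken = any(
            overlaps(start, end,
                     date.fromisoformat(b["start"]),
                     date.fromisoformat(b["end"]))
            for b in room["bookings"]
        )
        if not taken:
            free.append(room)
    return free

free = available_rooms(SAMPLE_DATA, date(2025, 7, 2), date(2025, 7, 4))
print([r["name"] for r in free])  # the Garden Suite is already booked for these dates
```

A Flask route would then just wrap `available_rooms` and serialize the result as JSON, which keeps the data layer easy to unit test on its own.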

Here are a few key takeaways from our video (which you should watch). 

But first, let’s talk about Copilot’s three modes

GitHub Copilot gives you three distinct modes: ask, edit, and agent. Ask is there to answer questions, edit is a precise code‑rewriting scalpel, and agent mode can drive an entire task from your prompt to the finished commit. Think of it this way: ask answers, edit assists, agent executes.

| Mode | What it does (nuts & bolts) | Ideal moments to reach for it |
|---|---|---|
| Ask mode | Analyzes the code you highlight (or the context of your open file) and returns explanations, examples, or quick fixes without touching your code. No diffs, and no saving. It's just conversational answers. | • Debug a puzzling stack trace<br>• Refresh your memory on a library or pattern<br>• Grab a snippet or algorithm on the fly |
| Edit mode | You select one or more files, describe a change in a plain-language prompt, and Copilot applies inline edits across those files. But first, it shows you a diff, so you can approve every change. | • Add error handling or refactor repetitive code<br>• Tight, multi‑file tweaks in a brown‑field codebase<br>• Apply team style rules via custom instructions |
| Agent mode | Feed it a high‑level prompt and Copilot plans steps, runs terminal commands, edits multiple files, and keeps iterating autonomously while surfacing risky commands for review. Great for project‑wide, multi‑step work. | • Scaffold a new service or feature from a README<br>• Large bug fixes that touch many modules<br>• Automated cleanups (e.g., migrate to Tailwind everywhere) |

Tip 1: No matter what model you use, context matters more than you think

The model you use is far from the only variable: the context you give your chosen model is often just as important.

That means the way you shape your prompt—and the context you provide Copilot with your prompt and additional files—makes a big difference in output quality. By toggling between capabilities, such as Copilot agent or edit mode, and switching models mid-session, we explored how Copilot responds when fed just the right amount of detail—or when asked to think a few steps ahead.

Our demo underscores that different modes impact results, and thoughtful prompting can dramatically change a model’s behavior (especially in complex or ambiguous coding tasks). 

The takeaway: If you’re not shaping your prompts and context deliberately, you’re probably leaving performance on the table.

For a deeper dive into model choice, the guide “Which AI model should I use with GitHub Copilot?” offers a comprehensive breakdown.

Tip 2: Copilot agent mode is a powerful tool

Agent mode, which is still relatively new and evolving fast, allows Copilot to operate more autonomously by navigating files, making changes, and performing repository-wide tasks with minimal handholding.

This mode opens up new workflow possibilities (especially for repetitive or large-scale changes). But it also demands a different kind of trust and supervision. Seeing it in action helps demystify where it fits in your workflows.

Here are two ways we used agent mode in our demo: 

  • One‑click project scaffolding: Kedasha highlighted the project README and simply told Copilot "implement this." Agent mode (running Gemini 2.5 Pro) created the entire Flask and Vue repository with directories, boilerplate code, unit tests, and even seeded data.
  • End‑to‑end technical docs: I started using agent mode with Claude 3.5 and prompted: "Make documentation for this app … include workflow diagrams in Mermaid." Copilot generated a polished README, API reference, and two Mermaid sequence/flow diagrams, then opened a preview so I could render the charts before committing.

Tip 3: Use custom instructions to set your ground rules

Another insight from the session is just how much mileage you can get from customizing Copilot’s behavior with custom instructions. 

If you haven't used them before, custom instructions let you lay down the rules before Copilot suggests anything (like how APIs need to be called, naming conventions, and style standards).

Kedasha in particular underscored how custom instructions can tailor tone, code style, and task focus to fit your workflow—or your team’s. 

One example? Using custom instructions to give every model the same ground rules, so swaps between each model produced consistent, secure code without re‑explaining standards each time.

Whether you’re nudging Copilot to avoid over-explaining, stick to a certain stack, or adopt a consistent commenting voice, the customization options are more powerful than most people realize. If you haven’t personalized Copilot yet, try custom instructions (and check out our Docs on them to get started).
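As a hypothetical example, a repository-level `.github/copilot-instructions.md` for an app like our demo might look something like this (the specific rules here are illustrative, not the ones we used in the video):

```markdown
# Copilot custom instructions for the travel-reservation demo

- Backend: Flask REST API; return JSON responses with snake_case keys.
- Frontend: Vue.js components styled with Tailwind utility classes only.
- Persist data to the local data.json file; never assume a real database.
- Validate all request payloads and return a 400 with an error message on failure.
- Keep comments brief; explain "why", not "what".
```

Because the file lives in the repository, every model you switch to picks up the same ground rules automatically.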

Tip 4: Balancing speed and output quality

No matter what model you use, there are always tradeoffs between responsiveness, completeness, and confidence. A larger model may not provide quick suggestions when you’re working through an edit, for instance—but a smaller model may not offer the best refactoring suggestions, even if it’s faster in practice. 

TL;DR: It’s not about chasing the “best” model—it’s about knowing when to switch, and why. Your default model might work 80% of the time—but having others on deck lets you handle edge cases faster and better.

Take this with you

This video demo isn’t a scripted feature demo. It’s two devs using Copilot the way you would—navigating unknowns, poking at what’s possible, and figuring out how to get better results by working smarter instead of harder.

If you’ve been sticking with the defaults or haven’t explored multi-model workflows, this is your invitation to take things further.

👉 Watch the full video to see how we put Copilot to work—and got more out of every mode, prompt, and model. 

Now—what will you build? Try GitHub Copilot to get started (we have a free tier that’s pretty great, too). 
