Local AI Chatbot and Coding Assistant in Your IDE

This article will help you set up a local coding assistant in Visual Studio Code and JetBrains IDEs using Ollama and Continue.

  • Ollama allows you to download and run open-source large language models. It is compatible with macOS, Linux, and Windows, and is continuously updated with support for newer models, including the new Llama 3.1.
  • Continue enables you to run your coding assistant directly inside Visual Studio Code and JetBrains.

Ollama

The simplest way to install Ollama is to go to https://ollama.com/, download the installer for your platform, and run it.

Once installed, you can download (pull) models to your local machine by running the following command in your terminal:

ollama pull model-name

You can browse the list of available models at https://ollama.com/library.
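
For example, to download Llama 3.1 and give it a quick test prompt from the terminal (the model name matches its tag in the Ollama library):

ollama pull llama3.1
ollama run llama3.1 "Write a Python function that reverses a string."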

Note

You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.[1]
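
To check how much disk space and memory your models actually use, recent versions of Ollama provide two helper commands:

ollama list   # models downloaded to disk, with their sizes
ollama ps     # models currently loaded, with their memory usage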

Continue

Search for the Continue extension within your IDE and install it. Then you can select models for different tasks (pull commands for the recommended models follow the list):

  1. Indexing: builds vector representations of your codebase and indexes them for fast retrieval, context-aware chat, and autocomplete. We recommend nomic-embed-text.
  2. Autocomplete: predicts your next tokens, much like GitHub Copilot. We recommend the deepseek-coder family, which ranges from 1B parameters up to 15.7B for deepseek-coder-v2, and codestral 22B for users with more powerful machines.
  3. Chat: lets you ask questions about your code, using your codebase as context. We recommend llama 3.1 8B and codestral 22B.

[Figure: asking about a function from a Python script]
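
To follow these recommendations, pull each model before selecting it in Continue (the names below are the corresponding tags in the Ollama library):

ollama pull nomic-embed-text   # indexing (embeddings)
ollama pull deepseek-coder     # autocomplete
ollama pull llama3.1           # chat
ollama pull codestral          # chat/autocomplete on more powerful machines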

If you don’t like how a model performs, you can easily change it.

[Figure: selecting other models from Ollama (autodetect)]

We provide an example Continue config.json:

{
    "models": [
        {
            "model": "AUTODETECT",
            "title": "Ollama",
            "apiBase": "http://localhost:11434",
            "provider": "ollama"
        }
    ],
    "tabAutocompleteModel": {
        "title": "deepseek-coder",
        "provider": "ollama",
        "model": "deepseek-coder"
    },
    "embeddingsProvider": {
        "provider": "ollama",
        "model": "nomic-embed-text"
    }
}

Here, models refers to the models used by the Chat component, tabAutocompleteModel to the autocomplete model, and embeddingsProvider to the indexing (embeddings) model.
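
The apiBase entry points at Ollama’s local server, which listens on port 11434 by default. You can verify that Continue will be able to reach it by querying the /api/tags endpoint, which returns the models available on your machine:

curl http://localhost:11434/api/tags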

Notes:

  • If your machine is not powerful enough, you can use lighter models and/or enable only the autocomplete or chat capabilities by editing the config.json file (see the sketch after this list).
  • There are other extensions in Visual Studio Code and JetBrains providing similar capabilities.
  • There are ways to run your coding assistant in other IDEs.
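
As a sketch, the config below swaps in smaller models and drops the indexing section. The llama3.1:8b and deepseek-coder:1.3b-base tags are taken from the Ollama library, and omitting embeddingsProvider should leave Continue on its built-in default; treat both as assumptions and adjust to your machine:

{
    "models": [
        {
            "model": "llama3.1:8b",
            "title": "Llama 3.1 8B",
            "provider": "ollama"
        }
    ],
    "tabAutocompleteModel": {
        "title": "deepseek-coder 1.3B",
        "provider": "ollama",
        "model": "deepseek-coder:1.3b-base"
    }
}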

Finally, you can fine-tune the LLM that you or your team uses:

When you use Continue, you automatically generate data on how you build software. By default, this development data is saved to .continue/dev_data on your local machine. [2]
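
You can inspect this data from the terminal. The commands below assume the default location in your home directory, and the autocomplete.jsonl filename is an illustrative example of the JSON Lines files Continue writes:

ls ~/.continue/dev_data                             # list the collected development data
head -n 1 ~/.continue/dev_data/autocomplete.jsonl   # peek at one record (filename is an example)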

References

[1] https://github.com/ollama/ollama

[2] https://ollama.com/blog/continue-code-assistant
