Local AI Chatbot and Coding Assistant in Your IDE

This article will help you set up a local coding assistant in Visual Studio Code and JetBrains IDEs using Ollama and Continue.

  • Ollama allows you to download and run open-source large language models. It is compatible with macOS, Linux, and Windows, and is continuously updated with support for newer models, including the new Llama 3.1.
  • Continue enables you to run your coding assistant directly inside Visual Studio Code and JetBrains.

Ollama

The simplest way to install Ollama is to go to https://ollama.com/, download the installer for your platform, and run it.

Once installed, you can download (pull) models to your local machine by running the following command in your terminal:

ollama pull model-name

You can browse the list of available models at https://ollama.com/library.
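
For example, to download Llama 3.1 and give it a quick test prompt from the terminal (the model name matches its tag in the Ollama library):

ollama pull llama3.1
ollama run llama3.1 "Write a Python function that reverses a string."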

Note

You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.[1]
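
To check how much disk space and memory your models actually use, recent versions of Ollama provide two helper commands:

ollama list   # models downloaded to disk, with their sizes
ollama ps     # models currently loaded, with their memory usage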

Continue

Search for the Continue extension within your IDE and install it. Then you can select models for different tasks (pull commands for the recommended models follow the list):

  1. Indexing: builds vector representations of your codebase and indexes them for fast retrieval, context-aware chat, and autocomplete. We recommend nomic-embed-text.
  2. Autocomplete: predicts your next tokens, much like GitHub Copilot. We recommend the deepseek-coder family, which ranges from 1B parameters up to 15.7B for deepseek-coder-v2, and codestral 22B for users with more powerful machines.
  3. Chat: lets you ask questions about your code, using your codebase as context. We recommend llama 3.1 8B and codestral 22B.

[Figure: asking about a function from a Python script]
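
To follow these recommendations, pull each model before selecting it in Continue (the names below are the corresponding tags in the Ollama library):

ollama pull nomic-embed-text   # indexing (embeddings)
ollama pull deepseek-coder     # autocomplete
ollama pull llama3.1           # chat
ollama pull codestral          # chat/autocomplete on more powerful machines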

If you don’t like how a model performs, you can easily change it.

[Figure: selecting other models from Ollama (autodetect)]

We provide an example Continue config.json:

{
    "models": [
        {
            "model": "AUTODETECT",
            "title": "Ollama",
            "apiBase": "http://localhost:11434",
            "provider": "ollama"
        }
    ],
    "tabAutocompleteModel": {
        "title": "deepseek-coder",
        "provider": "ollama",
        "model": "deepseek-coder"
    },
    "embeddingsProvider": {
        "provider": "ollama",
        "model": "nomic-embed-text"
    }
}

Here, models refers to the models used by the Chat component, tabAutocompleteModel to the autocomplete model, and embeddingsProvider to the indexing (embeddings) model.
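
The apiBase entry points at Ollama’s local server, which listens on port 11434 by default. You can verify that Continue will be able to reach it by querying the /api/tags endpoint, which returns the models available on your machine:

curl http://localhost:11434/api/tags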

Notes:

  • If your machine is not powerful enough, you can use lighter models and/or enable only the autocomplete or chat capabilities by editing the config.json file (see the sketch after this list).
  • There are other extensions in Visual Studio Code and JetBrains providing similar capabilities.
  • There are ways to run your coding assistant in other IDEs.
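
As a sketch, the config below swaps in smaller models and drops the indexing section. The llama3.1:8b and deepseek-coder:1.3b-base tags are taken from the Ollama library, and omitting embeddingsProvider should leave Continue on its built-in default; treat both as assumptions and adjust to your machine:

{
    "models": [
        {
            "model": "llama3.1:8b",
            "title": "Llama 3.1 8B",
            "provider": "ollama"
        }
    ],
    "tabAutocompleteModel": {
        "title": "deepseek-coder 1.3B",
        "provider": "ollama",
        "model": "deepseek-coder:1.3b-base"
    }
}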

Finally, you can fine-tune the LLM that you or your team uses:

When you use Continue, you automatically generate data on how you build software. By default, this development data is saved to .continue/dev_data on your local machine. [2]
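
You can inspect this data from the terminal. The commands below assume the default location in your home directory, and the autocomplete.jsonl filename is an illustrative example of the JSON Lines files Continue writes:

ls ~/.continue/dev_data                             # list the collected development data
head -n 1 ~/.continue/dev_data/autocomplete.jsonl   # peek at one record (filename is an example)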

References

[1] https://github.com/ollama/ollama

[2] https://ollama.com/blog/continue-code-assistant
