# Linking code from GitHub

export const feature_0 = "GitHub App"

export const verb_0 = undefined

Sourcebot can sync code from GitHub.com, GitHub Enterprise Server, and GitHub Enterprise Cloud.

If you're not familiar with Sourcebot [connections](/docs/connections/indexing-your-code), please read that overview first.

## Examples

<AccordionGroup>
  <Accordion title="Sync individual repos">
    ```json theme={null}
    {
        "type": "github",
        "repos": [
            "sourcebot-dev/sourcebot",
            "getsentry/sentry",
            "torvalds/linux"
        ]
    }
    ```
  </Accordion>

  <Accordion title="Sync all repos in an organization">
    ```json theme={null}
    {
        "type": "github",
        "orgs": [
            "sourcebot-dev",
            "getsentry",
            "vercel"
        ]
    }
    ```
  </Accordion>

  <Accordion title="Sync all repos owned by a user">
    ```json theme={null}
    {
        "type": "github",
        "users": [
            "torvalds",
            "ggerganov"
        ]
    }
    ```
  </Accordion>

  <Accordion title="Filter repos by topic">
    ```json theme={null}
    {
        "type": "github",
        // Sync all repos in `my-org` that have a topic that...
        "orgs": [
            "my-org"
        ],
        // ...match one of these glob patterns.
        "topics": [
            "test-*",
            "ci-*",
            "k8s"
        ]
    }
    ```
  </Accordion>

  <Accordion title="Exclude repos from syncing">
    ```json theme={null}
    {
        "type": "github",
        // Include all repos in my-org...
        "orgs": [
            "my-org"
        ],
        // ...except:
        "exclude": {
            // repos that are archived
            "archived": true,
            // repos that are forks
            "forks": true,
            // repos that match these glob patterns
            "repos": [
                "my-org/repo1",
                "my-org/repo2",
                "my-org/sub-org-1/**",
                "my-org/sub-org-*/**"
            ],
            "size": {
                // repos that are less than 1MB (in bytes)...
                "min": 1048576,
                // or repos greater than 100MB (in bytes)
                "max": 104857600
            },
            // repos with topics that match these glob patterns
            "topics": [
                "test-*",
                "ci"
            ]
        }
    }
    ```
  </Accordion>
</AccordionGroup>
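
The selectors shown above are independent properties of the same connection object, so a single connection can presumably combine them. The following is a sketch rather than a tested configuration: it assumes `orgs`, `users`, `repos`, and `exclude` compose as the schema reference describes, and `my-org` and `my-user` are placeholder names.

```json theme={null}
{
    "type": "github",
    // Placeholder names: sync every repo in `my-org`...
    "orgs": [
        "my-org"
    ],
    // ...plus every repo owned by `my-user`...
    "users": [
        "my-user"
    ],
    // ...plus one individually listed repo.
    "repos": [
        "sourcebot-dev/sourcebot"
    ],
    // Skip forks and archived repos.
    "exclude": {
        "forks": true,
        "archived": true
    }
}
```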

## Authenticating with GitHub

To index private repositories, you'll need to authenticate with GitHub. Sourcebot supports the following mechanisms for authenticating a GitHub connection:

<AccordionGroup>
  <Accordion title="GitHub App">
    <Note>
      {feature_0} {verb_0 ?? "is"} only available on a paid plan. Please activate a [license key](/docs/activating-a-subscription) to use this feature.
    </Note>

    <Steps>
      <Step title="Register a new GitHub App">
        Register a new [GitHub App](https://docs.github.com/en/apps/creating-github-apps/registering-a-github-app/registering-a-github-app#registering-a-github-app) and provide it with the following permissions:

        * “Contents” repository permissions (read)
        * “Metadata” repository permissions (read)
        * “Members” organization permissions (read)
        * “Email addresses” account permissions (read)

        This can be the same GitHub App you've registered and configured as an [external identity provider](/docs/configuration/idp#github).
      </Step>

      <Step title="Install the GitHub App">
        Install the GitHub App into the GitHub orgs that you want Sourcebot to be aware of. **Sourcebot will only be able to index repos from orgs with the GitHub App installed.**
      </Step>

      <Step title="Create a private key for the GitHub App">
        Create a [private key](https://docs.github.com/en/apps/creating-github-apps/authenticating-with-a-github-app/managing-private-keys-for-github-apps) for the GitHub App.
      </Step>

      <Step title="Define the GitHub App config in Sourcebot">
        Add an `apps` array to the Sourcebot [config file](/docs/configuration/config-file). The private key you created in the previous step must be passed in as a [token](/docs/configuration/config-file#tokens).

        ```json wrap icon="code" theme={null}
        "apps": [
            {
                "type": "github", // must be "github"
                "id": "1234567", // your GitHub App ID
                "privateKey": {
                    "env": "GITHUB_APP_PRIVATE_KEY" // name of the env var that holds your GitHub App private key
                }
            }
        ]
        ```
      </Step>

      <Step title="You're done!">
        That's it! Sourcebot will now use this GitHub App to authenticate when pulling repos for this connection.
      </Step>
    </Steps>
  </Accordion>

  <Accordion title="Fine-grained personal access tokens">
    <Steps>
      <Step title="Create PAT">
        Create a new fine-grained PAT [here](https://github.com/settings/personal-access-tokens/new). Select the resource owner and the repositories that you want Sourcebot to have access to.

        Next, under "Repository permissions", grant `Read-only` access to `Contents` and `Metadata`. The permissions should look like the following:

        <img src="https://mintcdn.com/sourcebot-whoisthey-language-model-input-modalities/UI3at4lP3VZBMCMb/images/github_pat_scopes_fine_grained.png?fit=max&auto=format&n=UI3at4lP3VZBMCMb&q=85&s=d67cffe3069a68607bd63cbee8ce21d7" alt="GitHub PAT Scope" width="1246" height="248" data-path="images/github_pat_scopes_fine_grained.png" />

        [GitHub docs](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#fine-grained-personal-access-tokens)
      </Step>

      <Step title="Pass PAT into Sourcebot">
        Next, provide the PAT to Sourcebot as a [token](/docs/configuration/config-file#tokens) referenced in the `token` field of the [connection](/docs/connections/indexing-your-code) config object.

        The most common way to do this is to define an environment variable that holds the PAT:

        ```json theme={null}
        {
            "type": "github",
            "token": {
                // note: this env var can be named anything. It
                // doesn't need to be `GITHUB_TOKEN`.
                "env": "GITHUB_TOKEN"
            },
            // At least one of the following is required to specify which repos to sync:
            // "orgs": ["my-org"],
            // "users": ["my-user"],
            "repos": ["my-org/myRepo"]
        }
        ```
      </Step>

      <Step title="You're done!">
        That's it! Sourcebot will now use this PAT to authenticate when pulling repos for this connection.
      </Step>
    </Steps>
  </Accordion>

  <Accordion title="Personal access tokens (classic)">
    <Steps>
      <Step title="Create PAT">
        Create a new PAT [here](https://github.com/settings/tokens/new) and make sure you select the `repo` scope:

        <img src="https://mintcdn.com/sourcebot-whoisthey-language-model-input-modalities/UI3at4lP3VZBMCMb/images/github_pat_scopes.png?fit=max&auto=format&n=UI3at4lP3VZBMCMb&q=85&s=00747544765ed07676089281b66eb55a" alt="GitHub PAT Scope" width="1572" height="364" data-path="images/github_pat_scopes.png" />

        [GitHub docs](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#personal-access-tokens-classic)
      </Step>

      <Step title="Pass PAT into Sourcebot">
        Next, provide the PAT to Sourcebot as a [token](/docs/configuration/config-file#tokens) referenced in the `token` field of the [connection](/docs/connections/indexing-your-code) config object.

        The most common way to do this is to define an environment variable that holds the PAT:

        ```json theme={null}
        {
            "type": "github",
            "token": {
                // note: this env var can be named anything. It
                // doesn't need to be `GITHUB_TOKEN`.
                "env": "GITHUB_TOKEN"
            },
            // At least one of the following is required to specify which repos to sync:
            // "orgs": ["my-org"],
            // "users": ["my-user"],
            "repos": ["my-org/myRepo"]
        }
        ```
      </Step>

      <Step title="You're done!">
        That's it! Sourcebot will now use this PAT to authenticate when pulling repos for this connection.
      </Step>
    </Steps>
  </Accordion>
</AccordionGroup>
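
According to the schema reference, the `token` field also accepts a `googleCloudSecret` in place of `env`. The following is a minimal sketch, assuming the PAT is stored in Google Cloud Secret Manager. The project ID, secret name, and version are hypothetical placeholders.

```json theme={null}
{
    "type": "github",
    "token": {
        // Hypothetical resource name. It must follow the format
        // projects/<project-id>/secrets/<secret-name>/versions/<version-id>
        "googleCloudSecret": "projects/my-project/secrets/github-pat/versions/1"
    },
    "repos": ["my-org/myRepo"]
}
```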

## Connecting to a custom GitHub host

To connect to a GitHub host other than `github.com`, set the `url` property in your connection config:

```json theme={null}
{
    "type": "github",
    "url": "https://github.example.com",
    // At least one of the following is required to specify which repos to sync:
    // "orgs": ["my-org"],
    // "users": ["my-user"],
    "repos": ["my-org/myRepo"]
}
```
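
If the repos on your custom host are private, `url` can be combined with the `token` field described in the authentication section. The following is a sketch under that assumption. The host name and the `GHES_TOKEN` environment variable name are placeholders, not required values.

```json theme={null}
{
    "type": "github",
    "url": "https://github.example.com",
    "token": {
        // Placeholder env var holding a PAT created on the custom host.
        "env": "GHES_TOKEN"
    },
    "orgs": ["my-org"]
}
```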

## Schema reference

<Accordion title="Reference">
  [schemas/v3/github.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/github.json)

  ```json theme={null}
  {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "title": "GithubConnectionConfig",
    "properties": {
      "type": {
        "const": "github",
        "description": "GitHub Configuration"
      },
      "token": {
        "description": "A Personal Access Token (PAT).",
        "anyOf": [
          {
            "type": "object",
            "properties": {
              "env": {
                "type": "string",
                "description": "The name of the environment variable that contains the token."
              }
            },
            "required": [
              "env"
            ],
            "additionalProperties": false
          },
          {
            "type": "object",
            "properties": {
              "googleCloudSecret": {
                "type": "string",
                "description": "The resource name of a Google Cloud secret. Must be in the format `projects/<project-id>/secrets/<secret-name>/versions/<version-id>`. See https://cloud.google.com/secret-manager/docs/creating-and-accessing-secrets"
              }
            },
            "required": [
              "googleCloudSecret"
            ],
            "additionalProperties": false
          }
        ]
      },
      "url": {
        "type": "string",
        "format": "url",
        "default": "https://github.com",
        "description": "The URL of the GitHub host. Defaults to https://github.com",
        "examples": [
          "https://github.com",
          "https://github.example.com"
        ],
        "pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
      },
      "users": {
        "type": "array",
        "items": {
          "type": "string",
          "pattern": "^[\\w.-]+$"
        },
        "default": [],
        "examples": [
          [
            "torvalds",
            "DHH"
          ]
        ],
        "description": "List of users to sync with. All repositories that the user owns will be synced, unless explicitly defined in the `exclude` property."
      },
      "orgs": {
        "type": "array",
        "items": {
          "type": "string",
          "pattern": "^[\\w.-]+$"
        },
        "default": [],
        "examples": [
          [
            "my-org-name"
          ],
          [
            "sourcebot-dev",
            "commaai"
          ]
        ],
        "description": "List of organizations to sync with. All repositories in the organization visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property."
      },
      "repos": {
        "type": "array",
        "items": {
          "type": "string",
          "pattern": "^[\\w.-]+\\/[\\w.-]+$"
        },
        "default": [],
        "description": "List of individual repositories to sync with. Expected to be formatted as '{orgName}/{repoName}' or '{userName}/{repoName}'."
      },
      "topics": {
        "type": "array",
        "items": {
          "type": "string"
        },
        "minItems": 1,
        "default": [],
        "description": "List of repository topics to include when syncing. Only repositories that match at least one of the provided `topics` will be synced. If not specified, all repositories will be synced, unless explicitly defined in the `exclude` property. Glob patterns are supported.",
        "examples": [
          [
            "docs",
            "core"
          ]
        ]
      },
      "exclude": {
        "type": "object",
        "properties": {
          "forks": {
            "type": "boolean",
            "default": false,
            "description": "Exclude forked repositories from syncing."
          },
          "archived": {
            "type": "boolean",
            "default": false,
            "description": "Exclude archived repositories from syncing."
          },
          "repos": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "default": [],
            "description": "List of individual repositories to exclude from syncing. Glob patterns are supported."
          },
          "topics": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "default": [],
            "description": "List of repository topics to exclude when syncing. Repositories that match one of the provided `topics` will be excluded from syncing. Glob patterns are supported.",
            "examples": [
              [
                "tests",
                "ci"
              ]
            ]
          },
          "size": {
            "type": "object",
            "description": "Exclude repositories based on their disk usage. Note: the disk usage is calculated by GitHub and may not reflect the actual disk usage when cloned.",
            "properties": {
              "min": {
                "type": "integer",
                "description": "Minimum repository size (in bytes) to sync (inclusive). Repositories less than this size will be excluded from syncing."
              },
              "max": {
                "type": "integer",
                "description": "Maximum repository size (in bytes) to sync (inclusive). Repositories greater than this size will be excluded from syncing."
              }
            },
            "additionalProperties": false
          }
        },
        "additionalProperties": false
      },
      "revisions": {
        "type": "object",
        "description": "The revisions (branches, tags) that should be included when indexing. The default branch (HEAD) is always indexed. A maximum of 64 revisions can be indexed, with any additional revisions being ignored.",
        "properties": {
          "branches": {
            "type": "array",
            "description": "List of branches to include when indexing. For a given repo, only the branches that exist on the repo's remote *and* match at least one of the provided `branches` will be indexed. The default branch (HEAD) is always indexed. Glob patterns are supported. A maximum of 64 branches can be indexed, with any additional branches being ignored.",
            "items": {
              "type": "string"
            },
            "examples": [
              [
                "main",
                "release/*"
              ],
              [
                "**"
              ]
            ],
            "default": []
          },
          "tags": {
            "type": "array",
            "description": "List of tags to include when indexing. For a given repo, only the tags that exist on the repo's remote *and* match at least one of the provided `tags` will be indexed. Glob patterns are supported. A maximum of 64 tags can be indexed, with any additional tags being ignored.",
            "items": {
              "type": "string"
            },
            "examples": [
              [
                "latest",
                "v2.*.*"
              ],
              [
                "**"
              ]
            ],
            "default": []
          }
        },
        "additionalProperties": false
      },
      "enforcePermissions": {
        "type": "boolean",
        "description": "Controls whether repository permissions are enforced for this connection. When `PERMISSION_SYNC_ENABLED` is false, this setting has no effect. Defaults to the value of `PERMISSION_SYNC_ENABLED`. See https://docs.sourcebot.dev/docs/features/permission-syncing"
      },
      "enforcePermissionsForPublicRepos": {
        "type": "boolean",
        "default": false,
        "description": "Controls whether repository permissions are enforced for public repositories in this connection. When true, public repositories are only visible to users with a linked account for this connection's code host. When false, public repositories are visible to all users. Has no effect when enforcePermissions is false. Defaults to false. See https://docs.sourcebot.dev/docs/features/permission-syncing"
      }
    },
    "required": [
      "type"
    ],
    "additionalProperties": false
  }
  ```
</Accordion>
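
The `revisions` property in the schema has no example elsewhere on this page. As a sketch based only on the schema descriptions, the following connection would index the default branch plus any branches and tags that match the given glob patterns. The patterns are taken from the schema's own `examples`, and `my-org/myRepo` is a placeholder.

```json theme={null}
{
    "type": "github",
    "repos": ["my-org/myRepo"],
    "revisions": {
        // The default branch (HEAD) is always indexed; these are additional.
        "branches": [
            "main",
            "release/*"
        ],
        "tags": [
            "v2.*.*"
        ]
    }
}
```

The schema states that at most 64 revisions are indexed and any extras are ignored, so narrow patterns may be a safer choice than `**` for repos with many branches or tags.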

## See also

* [Syncing GitHub Access permissions to Sourcebot](/docs/features/permission-syncing#github)
