summaryrefslogtreecommitdiff
path: root/packages/integrations/sitemap/README.md
blob: 6272feaa75e2ba7abcc06982f0c0ac26c9017ffe (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
# @astrojs/sitemap 🗺

This **[Astro integration][astro-integration]** generates a sitemap based on your pages when you build your Astro project.

- <strong>[Why Astro Sitemap](#why-astro-sitemap)</strong>
- <strong>[Installation](#installation)</strong>
- <strong>[Usage](#usage)</strong>
- <strong>[Configuration](#configuration)</strong>
- <strong>[Examples](#examples)</strong>
- <strong>[Troubleshooting](#troubleshooting)</strong>
- <strong>[Contributing](#contributing)</strong>
- <strong>[Changelog](#changelog)</strong>

## Why Astro Sitemap

A Sitemap is an XML file that outlines all of the pages, videos, and files on your site. Search engines like Google read this file to crawl your site more efficiently. [See Google's own advice on sitemaps](https://developers.google.com/search/docs/advanced/sitemaps/overview) to learn more.

A sitemap file is recommended for large multi-page sites. If you don't use a sitemap, most search engines will still be able to list your site's pages, but a sitemap is a great way to ensure that your site is as search engine friendly as possible.

With Astro Sitemap, you don't have to worry about creating this XML file yourself: the Astro Sitemap integration will crawl your statically-generated routes and create the sitemap file, including [dynamic routes](https://docs.astro.build/en/core-concepts/routing/#dynamic-routes) like `[...slug]` or `src/pages/[lang]/[version]/info.astro` generated by `getStaticPaths()`.

This integration cannot generate sitemap entries for dynamic routes in [SSR mode](https://docs.astro.build/en/guides/server-side-rendering/).

## Installation

### Quick Install

The `astro add` command-line tool automates the installation for you. Run one of the following commands in a new terminal window. (If you aren't sure which package manager you're using, run the first command.) Then, follow the prompts, and type "y" in the terminal (meaning "yes") for each one.

```sh
# Using NPM
npx astro add sitemap
# Using Yarn
yarn astro add sitemap
# Using PNPM
pnpm astro add sitemap
```

If you run into any issues, [feel free to report them to us on GitHub](https://github.com/withastro/astro/issues) and try the manual installation steps below.

### Manual Install

First, install the `@astrojs/sitemap` package using your package manager. If you're using npm or aren't sure, run this in the terminal:

```sh
npm install @astrojs/sitemap
```

Then, apply this integration to your `astro.config.*` file using the `integrations` property:

```diff lang="js" "sitemap()"
  // astro.config.mjs
  import { defineConfig } from 'astro/config';
+ import sitemap from '@astrojs/sitemap';

  export default defineConfig({
    // ...
    integrations: [sitemap()],
    //             ^^^^^^^^^
  });
```

## Usage

`@astrojs/sitemap` requires a deployment / site URL for generation. Add your site's URL under your `astro.config.*` using the `site` property. This must begin with `http:` or `https:`.

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  // ...
  site: 'https://stargazers.club',
  integrations: [sitemap()],
});
```

Note that unlike other configuration options, `site` is set in the root `defineConfig` object, rather than inside the `sitemap()` call.

Now, [build your site for production](https://docs.astro.build/en/reference/cli-reference/#astro-build) via the `astro build` command. You will find both `sitemap-index.xml` and `sitemap-0.xml` in the `dist/` folder (or your custom [output directory](https://docs.astro.build/en/reference/configuration-reference/#outdir) if set).

> **Warning**
> If you forget to add a `site`, you'll get a friendly warning when you build, and the `sitemap-index.xml` file won't be generated.

After verifying that the sitemaps are built, you can add them to your site's `<head>` and the `robots.txt` file for crawlers to pick up.

```diff lang="html"
  <!-- src/layouts/Layout.astro -->
  <head>
+   <link rel="sitemap" href="/sitemap-index.xml" />
  </head>
```

```diff
  # public/robots.txt
  User-agent: *
  Allow: /

+ Sitemap: https://<YOUR SITE>/sitemap-index.xml
```

### Example of generated files for a two-page website

```xml title="sitemap-index.xml"
<?xml version="1.0" encoding="UTF-8"?>
  <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://stargazers.club/sitemap-0.xml</loc>
  </sitemap>
</sitemapindex>
```

```xml title="sitemap-0.xml"
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://stargazers.club/</loc>
  </url>
  <url>
    <loc>https://stargazers.club/second-page/</loc>
  </url>
</urlset>
```

## Configuration

To configure this integration, pass an object to the `sitemap()` function call in `astro.config.mjs`.

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  integrations: [
    sitemap({
      // configuration options
    }),
  ],
});
```

### filter

All pages are included in your sitemap by default. By adding a custom `filter` function, you can filter included pages by URL.

```js
// astro.config.mjs
// ...
sitemap({
  filter: (page) => page !== 'https://stargazers.club/secret-vip-lounge/',
});
```

The function will be called for every page on your site. The `page` function parameter is the full URL of the page currently under considering, including your `site` domain. Return `true` to include the page in your sitemap, and `false` to leave it out.

To filter multiple pages, add arguments with target URLs.

```js
// astro.config.mjs
// ...
sitemap({
  filter: (page) =>
    page !== 'https://stargazers.club/secret-vip-lounge-1/' &&
    page !== 'https://stargazers.club/secret-vip-lounge-2/' &&
    page !== 'https://stargazers.club/secret-vip-lounge-3/' &&
    page !== 'https://stargazers.club/secret-vip-lounge-4/',
});
```

### customPages

In some cases, a page might be part of your deployed site but not part of your Astro project. If you'd like to include a page in your sitemap that _isn't_ created by Astro, you can use this option.

```js
// astro.config.mjs
// ...
sitemap({
  customPages: ['https://stargazers.club/external-page', 'https://stargazers.club/external-page2'],
});
```

### entryLimit

The maximum number entries per sitemap file. The default value is 45000. A sitemap index and multiple sitemaps are created if you have more entries. See this [explanation of splitting up a large sitemap](https://developers.google.com/search/docs/advanced/sitemaps/large-sitemaps).

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  site: 'https://stargazers.club',
  integrations: [
    sitemap({
      entryLimit: 10000,
    }),
  ],
});
```

### changefreq, lastmod, and priority

These options correspond to the `<changefreq>`, `<lastmod>`, and `<priority>` tags in the [Sitemap XML specification.](https://www.sitemaps.org/protocol.html)

Note that `changefreq` and `priority` are ignored by Google.

> **Note**
> Due to limitations of Astro's [Integration API](https://docs.astro.build/en/reference/integrations-reference/), this integration can't analyze a given page's source code. This configuration option can set `changefreq`, `lastmod` and `priority` on a _site-wide_ basis; see the next option **serialize** for how you can set these values on a per-page basis.

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  site: 'https://stargazers.club',
  integrations: [
    sitemap({
      changefreq: 'weekly',
      priority: 0.7,
      lastmod: new Date('2022-02-24'),
    }),
  ],
});
```

### serialize

A function called for each sitemap entry just before writing to a disk. This function can be asynchronous.

It receives as its parameter a `SitemapItem` object that can have these properties:

- `url` (absolute page URL). This is the only property that is guaranteed to be on `SitemapItem`.
- `changefreq`
- `lastmod` (ISO formatted date, `String` type)
- `priority`
- `links`.

This `links` property contains a `LinkItem` list of alternate pages including a parent page.

The `LinkItem` type has two fields: `url` (the fully-qualified URL for the version of this page for the specified language) and `lang` (a supported language code targeted by this version of the page).

The `serialize` function should return `SitemapItem`, touched or not.

The example below shows the ability to add sitemap specific properties individually.

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  site: 'https://stargazers.club',
  integrations: [
    sitemap({
      serialize(item) {
        if (/exclude-from-sitemap/.test(item.url)) {
          return undefined;
        }
        if (/your-special-page/.test(item.url)) {
          item.changefreq = 'daily';
          item.lastmod = new Date();
          item.priority = 0.9;
        }
        return item;
      },
    }),
  ],
});
```

### i18n

To localize a sitemap, pass an object to this `i18n` option.

This object has two required properties:

- `defaultLocale`: `String`. Its value must exist as one of `locales` keys.
- `locales`: `Record<String, String>`, key/value - pairs. The key is used to look for a locale part in a page path. The value is a language attribute, only English alphabet and hyphen allowed.

[Read more about language attributes](https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/lang).

[Read more about localization](https://developers.google.com/search/docs/advanced/crawling/localized-versions#all-method-guidelines).

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  site: 'https://stargazers.club',
  integrations: [
    sitemap({
      i18n: {
        defaultLocale: 'en', // All urls that don't contain `es` or `fr` after `https://stargazers.club/` will be treated as default locale, i.e. `en`
        locales: {
          en: 'en-US', // The `defaultLocale` value must present in `locales` keys
          es: 'es-ES',
          fr: 'fr-CA',
        },
      },
    }),
  ],
});
```

The resulting sitemap looks like this:

```xml
...
  <url>
    <loc>https://stargazers.club/</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://stargazers.club/"/>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://stargazers.club/es/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://stargazers.club/fr/"/>
  </url>
  <url>
    <loc>https://stargazers.club/es/</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://stargazers.club/"/>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://stargazers.club/es/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://stargazers.club/fr/"/>
  </url>
  <url>
    <loc>https://stargazers.club/fr/</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://stargazers.club/"/>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://stargazers.club/es/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://stargazers.club/fr/"/>
  </url>
  <url>
    <loc>https://stargazers.club/es/second-page/</loc>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://stargazers.club/es/second-page/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://stargazers.club/fr/second-page/"/>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://stargazers.club/second-page/"/>
  </url>
...
```

## Examples

- The official Astro website uses Astro Sitemap to generate [its sitemap](https://astro.build/sitemap-index.xml).
- [Browse projects with Astro Sitemap on GitHub](https://github.com/search?q=%22%40astrojs%2Fsitemap%22+path%3Apackage.json&type=Code) for more examples!

## Troubleshooting

For help, check out the `#support` channel on [Discord](https://astro.build/chat). Our friendly Support Squad members are here to help!

You can also check our [Astro Integration Documentation][astro-integration] for more on integrations.

## Contributing

This package is maintained by Astro's Core team. You're welcome to submit an issue or PR!

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for a history of changes to this integration.

[astro-integration]: https://docs.astro.build/en/guides/integrations-guide/