Code Dump: Contentstack .NET Static Site Exporter

This blog post describes a prototype .NET command line tool that exports a website to static HTML and JSON files in subdirectories corresponding to the URLs of the entries in the Contentstack SaaS headless CMS. You can use a solution based on this approach to spider any Contentstack content delivery website to HTML and JSON files, regardless of whether that solution uses ASP.NET.

The directory, connection credentials, and website are hard-coded. The code is a single linear procedure that pages through the content types, then pages through the entries for each content type, retrieves pages, creates directories, and writes HTML and JSON files. It would likely run faster with threading, for example by creating a new thread for each content type, each page, and/or each entry. The code does not count content types or entries; it assumes that if the last page had 100 entries, there could be another page. It also halts at the first exception.
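The "a full page means there may be another page" rule can be sketched as below. This is a minimal illustration, not the tool's code: `FetchPage` and `FetchAll` are hypothetical names, and the in-memory list stands in for whatever CMS query a real exporter would issue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a CMS query: returns up to `limit` items starting at `skip`.
// A real exporter would call the CMS API here instead of slicing a list.
static List<string> FetchPage(IReadOnlyList<string> all, int skip, int limit) =>
    all.Skip(skip).Take(limit).ToList();

// Keeps requesting pages until one comes back short.
// No total count is requested; a full page only means there could be more.
static IEnumerable<string> FetchAll(Func<int, int, List<string>> fetchPage, int pageSize = 100)
{
    for (int skip = 0; ; skip += pageSize)
    {
        List<string> page = fetchPage(skip, pageSize);
        foreach (string item in page)
        {
            yield return item;
        }
        if (page.Count < pageSize)
        {
            yield break; // Short or empty page: nothing further to fetch.
        }
    }
}

// Demo with 250 fake entries, counting how many page requests are made.
var entries = Enumerable.Range(1, 250).Select(i => $"entry-{i}").ToList();
int requests = 0;
var result = FetchAll((skip, limit) =>
{
    requests++;
    return FetchPage(entries, skip, limit);
}).ToList();

Console.WriteLine($"{result.Count} entries in {requests} requests");
```

With 250 entries this makes three requests (100, 100, and 50 items). When the total is an exact multiple of the page size, the same logic spends one extra request on an empty page, which is the price of not counting first.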

This assumes index.html as the default file name for the web server and moves and renames files accordingly. Rather than creating /parent/child.html and /parent/child/grandchild.html, it creates /parent/child/index.html and /parent/child/grandchild.html, unless /parent/child/grandchild/greatgrandchild is also a URL, in which case it creates /parent/child/grandchild/index.html and /parent/child/grandchild/greatgrandchild.html.
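One way to express that naming rule is to decide each file name from the full set of URLs up front, rather than moving and renaming files as children turn up. The sketch below takes that simplified route; `MapUrlsToFiles` is a hypothetical helper, and it assumes all entry URLs are known before any file is written.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Maps each URL to a relative file path:
// a URL that has descendants becomes <url>/index.html, a leaf URL becomes <url>.html.
static Dictionary<string, string> MapUrlsToFiles(IEnumerable<string> urls)
{
    List<string> normalized = urls.Select(u => u.TrimEnd('/')).Distinct().ToList();
    var map = new Dictionary<string, string>();
    foreach (string url in normalized)
    {
        if (url.Length == 0)
        {
            map[url] = "index.html"; // The site root is always the default document.
            continue;
        }
        // Quadratic scan over all URLs; fine for a sketch, slow for very large sites.
        bool hasChild = normalized.Any(other =>
            other.StartsWith(url + "/", StringComparison.Ordinal));
        string path = hasChild ? url + "/index.html" : url + ".html";
        map[url] = path.TrimStart('/');
    }
    return map;
}

var files = MapUrlsToFiles(new[]
{
    "/parent/child",
    "/parent/child/grandchild",
    "/parent/child/grandchild/greatgrandchild",
});

foreach (string url in files.Keys.OrderBy(k => k, StringComparer.Ordinal))
{
    Console.WriteLine($"{url} -> {files[url]}");
}
```

For the three URLs in the demo, /parent/child and /parent/child/grandchild map to index.html files inside their own directories, and /parent/child/grandchild/greatgrandchild maps to greatgrandchild.html.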

I developed this tool for a solution that uses ASP.NET to render pages from data in Contentstack. To support previewing without static file generation on every save event, the CMS previewing environment uses ASP.NET to retrieve data from the CMS and render pages dynamically. The command line tool spiders that site to static HTML files for deployment to content delivery environments such as production.
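The spidering step itself amounts to requesting a rendered page and saving the response body. The following is a rough sketch under stated assumptions, not the tool's implementation: `SavePageAsync` is a hypothetical helper, the site address and URL path come from command-line arguments, the output goes to a directory named `export`, and the index.html naming rule is ignored for brevity.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Downloads one rendered page and writes it under outputRoot,
// creating any missing directories first. Returns the path written.
static async Task<string> SavePageAsync(
    HttpClient client, Uri pageUri, string outputRoot, string relativePath)
{
    // GetStringAsync throws on a non-success status code, so the run halts on the first failure.
    string html = await client.GetStringAsync(pageUri);
    string filePath = Path.Combine(outputRoot, relativePath);
    string directory = Path.GetDirectoryName(filePath) ?? outputRoot;
    Directory.CreateDirectory(directory);
    await File.WriteAllTextAsync(filePath, html);
    return filePath;
}

if (args.Length < 2)
{
    Console.WriteLine("usage: <site base URL> <URL path, for example /parent/child>");
    return;
}

var baseUri = new Uri(args[0]);
string urlPath = args[1];

using var client = new HttpClient();
string saved = await SavePageAsync(
    client, new Uri(baseUri, urlPath), "export", urlPath.Trim('/') + ".html");
Console.WriteLine($"Wrote {saved}");
```

Because the request goes to the rendered site rather than to the CMS, the same approach works whatever technology renders the pages.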

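Deleting entries and changing their URLs both affect the file structures that export tools create: files generated by an earlier run can linger after the corresponding entry is deleted or moved. Generate files into a directory that does not already contain generated files or the subdirectories that hold them.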
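A small guard can enforce the fresh-directory advice before any files are written. This is a minimal sketch; `PrepareOutputDirectory` is a hypothetical helper, and refusing to run (rather than deleting old files) is a choice made here, not something the original tool is known to do.

```csharp
using System;
using System.IO;
using System.Linq;

// Creates the output directory if needed and refuses to reuse one that already has content,
// so files from an earlier export cannot mix with the new ones.
static string PrepareOutputDirectory(string path)
{
    if (Directory.Exists(path) && Directory.EnumerateFileSystemEntries(path).Any())
    {
        throw new IOException($"Output directory '{path}' is not empty; choose a fresh one.");
    }
    Directory.CreateDirectory(path);
    return Path.GetFullPath(path);
}

// Demo: a uniquely named directory under the system temp folder is always fresh.
string output = PrepareOutputDirectory(
    Path.Combine(Path.GetTempPath(), "export-" + Guid.NewGuid().ToString("N")));
Console.WriteLine(output);
```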

You may also want to generate sitemap.xml and robots.txt.
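Both files can be produced from the same list of URLs that drives the export. The sketch below is one possible shape, with hypothetical helper names (`BuildSitemap`, `BuildRobots`) and an example.com site root standing in for the real production address.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Builds a sitemap with one <url><loc> element per exported URL path.
static XDocument BuildSitemap(string siteRoot, IEnumerable<string> urlPaths)
{
    XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
    string root = siteRoot.TrimEnd('/');
    return new XDocument(
        new XDeclaration("1.0", "utf-8", null),
        new XElement(ns + "urlset",
            urlPaths.Select(p => new XElement(ns + "url",
                new XElement(ns + "loc", root + p)))));
}

// Builds a permissive robots.txt that points crawlers at the sitemap.
static string BuildRobots(string siteRoot) =>
    "User-agent: *\nAllow: /\nSitemap: " + siteRoot.TrimEnd('/') + "/sitemap.xml\n";

// Demo: write both files to a fresh directory under the system temp folder.
string siteRoot = "https://www.example.com"; // Assumed production address.
var paths = new[] { "/parent/child", "/parent/child/grandchild" };

string outputDir = Path.Combine(Path.GetTempPath(), "export-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(outputDir);

BuildSitemap(siteRoot, paths).Save(Path.Combine(outputDir, "sitemap.xml"));
File.WriteAllText(Path.Combine(outputDir, "robots.txt"), BuildRobots(siteRoot));

Console.WriteLine($"Wrote sitemap.xml and robots.txt to {outputDir}");
```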

