Import your content into Drupal with Feeds

Thursday, December 20, 2018

Feeds is a Drupal module designed for content import. "Content" is meant in a broad sense here: with Feeds you can create not only nodes but also users, taxonomy terms and other entities.

Compared with other migration modules such as Migrate, which is included in Drupal 8 core, these are the features that make Feeds special:

All imports are created from Drupal's UI.
It offers several ways to supply the source data for imports: file upload, URL, or local directory.
All imports can be done from the website without any configuration change.
Imports can be scheduled.
It is not restricted to a single data source.

In other words, it is a simpler module than Migrate, or at least easier to handle, as everything is done through the user interface without the need to write code. This makes it generally advisable to use Feeds when:

Content import will be done by a non-technical person.
Several imports need to be made from different files, URLs, etc.
All the content is ready to import without any additional processing.

For example, imagine that our website shows the results of a sports competition, say petanque games played in all the public parks of a city. Someone collects the results of the games and saves them in a CSV file. Feeds would periodically process that CSV, adding the new games to our website, and it could also update previously imported games that have been modified (because of an initial transcription error, because a player's claim has been addressed, etc.).
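To make the example concrete, here is a minimal Python sketch of what such a results file might look like and how its rows read once parsed. The column names (match_id, park, team_a, team_b, score_a, score_b) and the sample values are assumptions made up for illustration; they are not required by Feeds.

```python
import csv
import io

# Hypothetical results file; column names and values are illustrative only.
RESULTS_CSV = """match_id,park,team_a,team_b,score_a,score_b
2018-001,Central Park,Los Pinos,La Bola,13,9
2018-002,River Park,El Cochonnet,Los Pinos,7,13
"""


def read_matches(text):
    """Parse the CSV text into one dict per match, keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


matches = read_matches(RESULTS_CSV)
for match in matches:
    print(match["match_id"], match["team_a"], match["score_a"],
          "-", match["score_b"], match["team_b"])
```

A column such as match_id, which identifies each game, is what later allows an import to recognise a game it has already seen.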

It could also be useful if we need to import public data from several websites into a new website. With Feeds we could configure the import of data from these different sources, with the advantage that if the data is modified on the original websites, those changes will be transferred to our website on subsequent imports.

Let's see how it's done.

Import data with Feeds

Installation

The first step is to install the module. You can do this using Composer:

cd /path/to/project
composer require drupal/feeds

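After downloading it, enable the module like any other, for example from the Extend page (/admin/modules). No additional setup is needed.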

General overview

The idea of Feeds is this: first, we must create an import configuration, called a Feed Type, which defines in general terms how to import data, from what type of source, and where to put it. Once we have our import configuration, we must create a Feed, which is what actually performs the import, using the Feed Type that we have previously created.

This scheme allows us to use the same configuration for different sources, so we configure it once and use it in several imports. In the previous petanque example, let's imagine that the league extends to more cities. One person per city would create the CSV and save it in a different folder, so that each person works on their own file without interfering with the others. We could reuse the same configuration in different imports, one for each folder with the data of each city. In the end, all the data would be imported into the same content type (for example, a "match" content type).
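The relationship between one Feed Type and several Feeds can be sketched as a small data model. This is only an illustration in Python: the class names, attributes, city names and folder paths below are assumptions, not the Feeds API or its storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedType:
    """Shared import configuration (illustrative, not the Feeds API)."""
    label: str
    fetcher: str    # how the data is obtained, e.g. "directory"
    parser: str     # format of the source data, e.g. "csv"
    processor: str  # what gets created, e.g. "node:match"


@dataclass
class Feed:
    """One concrete import: a source plus the Feed Type it reuses."""
    title: str
    source: str
    feed_type: FeedType


match_results = FeedType("Match results", "directory", "csv", "node:match")

# One Feed per city (hypothetical cities and folders), all sharing one configuration.
feeds = [
    Feed(f"Results: {city}", f"private://results/{city}", match_results)
    for city in ("madrid", "sevilla", "valencia")
]

for feed in feeds:
    print(feed.title, "->", feed.source, f"({feed.feed_type.parser})")
```

Changing the shared configuration once would then affect every import that uses it, which is the point of the separation.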

This separation is reflected in where Feeds places Feed Types and Feeds in the administration menu: Feed Types are under the Structure section, next to content types, taxonomies, views and comment types. Feeds are under the Content section because, after all, they are potential content.

Create an import configuration (Feed Type)

In Feeds jargon, this means creating a Feed Type. It is the general configuration of an import. Here we can specify the origin of the data (a URL, a file, etc.), the content it will be imported into (e.g. news type content) and other settings.

Feed Types are created under Structure > Feeds Types > Add Feed Type (/admin/structure/feeds/add). This page contains the settings described below.

The first thing you need to provide is a name and a description for the importer. These simply make it easy to tell what the importer is for: they are data for humans and have no impact on the importer's execution.

Then come the importer options that configure its behaviour. Let's review them:

Fetcher

With this option, we choose how the data is fetched when the import runs. These are the available options:

Download: Import by URL provided by the user.
Upload: Content will be imported by a manual file upload.
Directory: Files are read from a folder in one of Drupal's file systems (public, private, or any other available one).

After selecting the fetcher, we can configure its specific options in the "Fetcher settings" tab.

Parser

This is the parser for the source data. It indicates the format of the file with the data to be imported. Among the available formats, we have RSS/ATOM, CSV and XML.

After having chosen a source format we can configure it in the "Parser settings" tab. For example, for CSV files we can indicate the separator character (comma, colon, semicolon, etc).
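Why the separator setting matters can be seen with a short Python sketch. It uses Python's standard csv module rather than Feeds' own parser, and the sample line is made up, so treat it as an illustration of the idea only.

```python
import csv
import io

# Hypothetical export that uses semicolons as separators.
SEMICOLON_CSV = "match_id;park;score\n2018-001;Central Park;13-9\n"


def parse(text, delimiter):
    """Return the rows of a delimited text as lists of cell values."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


wrong = parse(SEMICOLON_CSV, ",")  # wrong separator: each line stays a single cell
right = parse(SEMICOLON_CSV, ";")  # matching separator: three cells per line

print(len(wrong[0]), len(right[0]))
```

With the wrong separator the header row is read as one long column name, so no column in the file could be matched by name.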

Processor

The processor is responsible for creating the appropriate content in Drupal with the data received. The available processors include nodes (the most common), users, taxonomy terms and many more. In other words, a node type processor will create nodes with the data to be imported.

The processor is configured in "Processor settings". The most important options in this configuration are:

Update existing contents: Lets you choose the behaviour when importing previously imported content. We can choose to do nothing, replace the content (i.e., delete the current content and create a new one), or update the existing content (the content is not deleted and recreated, only edited). Normally the last option is the one to choose, so that we keep the content 'alive', updating it when it changes in the data source. Because it remains the same internal entity in Drupal, any relationship from other content to this one is preserved. If we deleted the content and created a new one, for example, we would lose those relationships.
Previously imported items: Lets you choose what to do with content that was created in a previous import but is no longer present in subsequent imports. There are many options here, from preserving the content to deleting it, unpublishing it, etc. This option controls what happens when already imported content disappears from the data source. Should we delete it? Just unpublish it? Leave it available? It is also possible to carry out apparently strange actions, such as publishing the content when it is missing from the latest import. Strange? Yes, but it can be a way to manage content publishing. It's just another example of the flexibility of Feeds.
Author: Lets you choose the author assigned to the created content. The author is selected from the existing Drupal users.
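The first two options can be modelled with a toy Python sketch. It follows the behaviours as described in this article and says nothing about how Feeds is implemented internally; the function, the option values ("skip", "replace", "update", "keep", "unpublish", "delete") and the match_id key are all assumptions chosen for illustration.

```python
import itertools

_ids = itertools.count(1)  # stand-in for Drupal's internal entity ids


def run_import(store, items, update_policy="update", missing_action="keep"):
    """Toy model of the processor options described above.

    store maps a unique key to an entity dict. Option names and values
    are illustrative assumptions, not the Feeds API.
    """
    for item in items:
        key = item["match_id"]
        if key not in store or update_policy == "replace":
            # New item, or delete-and-recreate: a fresh internal id.
            store[key] = {"id": next(_ids), "published": True, **item}
        elif update_policy == "update":
            # Edit in place: the internal id is preserved.
            store[key].update(item)
        # update_policy == "skip": leave the existing entity untouched.

    # Handle entities that are no longer present in the source.
    seen = {item["match_id"] for item in items}
    for key in [k for k in store if k not in seen]:
        if missing_action == "delete":
            del store[key]
        elif missing_action == "unpublish":
            store[key]["published"] = False
    return store


first = [{"match_id": "2018-001", "score": "13-9"},
         {"match_id": "2018-002", "score": "7-13"}]
second = [{"match_id": "2018-001", "score": "13-10"}]  # corrected; 2018-002 is gone

for policy in ("skip", "replace", "update"):
    store = run_import({}, first, policy)
    old_id = store["2018-001"]["id"]
    run_import(store, second, policy, missing_action="unpublish")
    entity = store["2018-001"]
    print(policy, entity["score"], entity["id"] == old_id,
          store["2018-002"]["published"])
```

In this model only "update" both picks up the corrected score and keeps the same internal id, which is why references from other content would survive.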

Settings

There is also a settings tab that, at the moment, only lets us select how often an import runs. This is useful if our import is based on URLs or directories, as their contents may change. In this way we can set up a periodic import, which is the most common setup in Feeds, although it is not mandatory.

Once created, all Feed Types are accessible under Structure > Feeds Types (/admin/structure/feeds).

Add fields to the feed type

Now we can start mapping the fields of our data source to the fields of the content we will import. Here is an example of adding fields from a CSV file.

To do so, go to the "Mapping" tab, where we can start adding fields.

The first thing we have to do is select the first field we want to add (for example, the "Title") using the "Select a target" selector. In other words, we configure where to get the data that will go in the Drupal content field called "Title".

Next, select where the value comes from, using the "Select a source" selector. When we open the source selector we will see several options. Most are fixed values that are not provided by the input file. These options are useful when you want all the created content to have the same value for a given field. That is not the usual case, though: normally you select a field from the input file.

The names of the fields in the data source file (for example, the columns of a CSV file) will not be in this list, so we cannot select them directly. This seems confusing at first, because we would expect to see the fields of the source file in this selector. However, this is not the case (perhaps because Feeds would have to preprocess the file in question, which would make it less flexible). Therefore, to add a field from our data source we have to choose "New CSV source" (for another kind of source it will be "New XML source", for example). This shows a new text box where we enter the name of the field in our data source. For CSV files, for example, it is the name of the column (according to the file's headers).

Once the mapping has been added, its row shows some additional options on the right:

Unique: Indicates whether the value of that field is unique for each content item or not. This helps Feeds determine whether an imported item is the same as content already in Drupal. When a match is found, Feeds applies the behaviour chosen in the "Update existing contents" option of the import configuration seen above. Therefore, this is normally applied to a field that uniquely identifies each item to be imported. If no field is marked as unique, Feeds cannot detect that an item has already been imported, and duplicates will be created.
Configure: Depending on the type of field where the data will be inserted, we may have different options. For example, when the field is a taxonomy term there is an option to create the term automatically if it does not exist. For a text field with summary, we can choose the text format. And so on, according to the options of each field.
Summary: Displays the options chosen in the field configuration. It's just informative, so you don't have to open the field configuration form to see how it's configured.
Delete: It deletes the mapping.
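The effect of the mapping and of the "Unique" flag can be illustrated with a small Python model. The mapping structure, the field names (field_match_id, title) and the functions are hypothetical and only mirror the description above; they are not how Feeds stores or applies mappings.

```python
# Hypothetical mapping: source column -> (target field, is_unique).
MAPPING = {
    "match_id": ("field_match_id", True),
    "park": ("title", False),
}


def map_row(row, mapping):
    """Build the target field values for one source row."""
    return {target: row[source] for source, (target, _unique) in mapping.items()}


def import_rows(existing, rows, mapping):
    """Add mapped rows to existing, merging on the unique targets if there are any."""
    unique_targets = [target for target, unique in mapping.values() if unique]
    for row in rows:
        values = map_row(row, mapping)
        match = None
        if unique_targets:
            match = next(
                (e for e in existing
                 if all(e[t] == values[t] for t in unique_targets)),
                None,
            )
        if match is not None:
            match.update(values)     # same item seen again: update it
        else:
            existing.append(values)  # cannot be recognised: create a new one
    return existing


rows = [{"match_id": "2018-001", "park": "Central Park"}]
no_unique = {src: (tgt, False) for src, (tgt, _u) in MAPPING.items()}

# Import the same file twice, with and without a unique field.
with_unique = import_rows(import_rows([], rows, MAPPING), rows, MAPPING)
without_unique = import_rows(import_rows([], rows, no_unique), rows, no_unique)

print(len(with_unique), len(without_unique))
```

Importing the same row twice yields a single item when a unique field is set and two items when none is, which is the duplicate problem described under "Unique".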

Create an import

We already have the configuration of the import. Now we have to create the import itself, using the Feed Type we have created. To do this, go to Content > Feeds (/admin/content/feeds) and press "Add feed".

On this screen, we have to add a title and select the data source.

If the Feed Type imports by file upload, we have to upload a file here. If we have chosen another way to get the file we will see other options (for example, a URL field when the fetcher is "Download").

Now we can press the "Save and import" button. Depending on our import configuration (Feed Type) this will start immediately or it will be done periodically.

Once the import is created we will be able to edit it, launch the import, delete it, delete the items created through the import, etc.

Developer tips

When we set up an import, the tools Feeds provides may be enough, but sometimes we need extra functionality.

For those cases, Feeds can be extended through custom modules, in which we can create the following types of plugin:

FeedsTarget: allows you to create a field target, so that you can put the imported value wherever you need it. Feeds includes targets for all the core fields, so if you use a field provided by another module or your own custom field, you will have to create a plugin of this type so that Feeds knows how to handle the field.
FeedsProcessor: allows you to create a Feeds processor to create the content you need. That is, if you need to create entities not provided by the core you will need to create a specific processor.
FeedsSource: allows you to create data source fields. For example, we could create a data source for a field that is the current time, so that Feeds would insert the current time in the field to which this data source was mapped.

As you can see, Feeds is a good tool for importing content into our Drupal site, especially when it comes from different sources and is imported periodically. It is simple to set up and, in most cases, flexible enough not to need additional code.
