Overview | On-Premise | Urbi Documentation

On-Premise overview

2GIS On-Premise is a set of services that allows you to deploy 2GIS products within your own infrastructure.

Compared to using 2GIS's own cloud infrastructure, 2GIS On-Premise has the following advantages:

  • Enhanced data security: tracking and control of requests, ability to limit traffic to your local network.
  • Flexible access management: create access keys according to your internal policy.
  • Full control over the infrastructure: ability to quickly scale up or down depending on your needs.

On-Premise services are distributed as Docker images and are designed to operate in a Kubernetes cluster. This design provides the following advantages:

  • Faster and more cost-effective service updates due to containerization. This avoids setting up all dependencies on each deployment iteration.
  • Simple horizontal scaling of services that helps to:
    • Ensure high availability of the services. Any service can serve requests even if some of its instances fail. This is achieved by combining replication with Ingress load balancing.
    • Increase the services' performance. Caching at various application layers significantly reduces request processing time.
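As an illustration, a deployed service can be scaled horizontally with standard Kubernetes tooling. The deployment name (tiles-api) and namespace (onpremise) below are hypothetical and depend on your Helm release configuration:

```shell
# Scale a hypothetical Tiles API deployment to four replicas.
kubectl scale deployment tiles-api --replicas=4 -n onpremise

# Verify that the new replicas are running (label is a placeholder):
kubectl get pods -n onpremise -l app=tiles-api
```

With an Ingress in front of the deployment, the new replicas start receiving traffic automatically.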

A single On-Premise service can implement several 2GIS products, and a product can rely on several services (see the table below).

2GIS Product                 On-Premise Service

Maps
  MapGL JS API               MapGL JS API
                             Tiles API

Search
  Places API                 Catalog API
  Geocoder API               Search API
  Suggest API
  Categories API
  Regions API

Navigation
  Directions API             Navi-Castle
  Distance Matrix API        Navi-Front
  Truck Direction API        Navi-Router
  Pairs Direction API        Navi-Back
  Map Matching API           Distance Matrix Async API
  Isochrone API
  Public Transport API
  TSP API

GIS Platform
  GIS Platform               GIS Platform

Downloading the Docker images of On-Premise services requires an access key. To obtain the key, fill out the form at dev.2gis.com/order.

On-Premise services are not fully self-contained: they share common infrastructure that enables them to operate. A significant part of the services and infrastructure can be deployed in an isolated private network with limited or no internet access (this depends on the service; see the Deployment considerations section for details).

Architecture and infrastructure

The shared infrastructure comprises:

  • Data storage for services. Data storage includes the following data services:

    • Apache Cassandra
    • Apache Kafka
    • PostgreSQL
    • Redis
    • Kubernetes Persistent Volumes that can be claimed via dynamic Persistent Volume Claim.

    Different services use different combinations of data storage services. See the Deployment requirements section for details.

  • Update/delivery infrastructure:

    • 2GIS CLI application to fetch the up-to-date deployment artifacts (Docker images and datasets).
    • Deployment Artifact Storage to store the fetched data. Any S3-compatible storage is suitable. Support for other types of storage is under active development.
    • Docker Registry to store the fetched Docker images.
    • Kubernetes Importer job to upload the fetched data to the data storage.
    • Helm application to deploy and update the services.

    See the Update and data management processes section for details.

  • API Keys service to manage access to the APIs of the deployed On-Premise services.

    This service requires Apache Kafka, PostgreSQL, and Redis data storage services, as well as pre-deployed LDAP servers for authenticating users.

  • LDAP server. The server is required to authenticate users of the API Keys Admin service. See the On-Premise API Keys service document for details.

  • Kubernetes Ingress controller for load balancing the requests to instances of a service. Almost every service is placed behind a load balancer to achieve the requirements for scalability and high availability.

The following On-Premise services may require access to real-time traffic data in certain use cases:

  • Maps service: MapGL JS API service uses the data to plot colored traffic status on a map.
  • Navigation service: Navi-Back service uses the data to build routes that account for current traffic conditions.
  • GIS Platform service: GIS Platform uses the data to plot colored traffic status on a map's overlay.

The On-Premise services architecture implies that the services can operate without internet access. However, fetching the real-time traffic data requires access to the public 2GIS servers.

To overcome this limitation, the aforementioned services use a dedicated Traffic Proxy service. This way, limited outbound internet access is granted solely to the proxy service, while all other services that a given On-Premise service requires remain without internet access.

The Traffic Proxy service can also be used to get map tiles with traffic information in the TMS format.

API keys usage

The API Keys service is used to:

  • Issue and revoke the regular API keys for end users. These keys grant access to a certain set of deployed On-Premise services.
  • Gather metrics and statistics from deployed On-Premise services to enforce access policies based on usage metrics for a given API key. To do that, the services should have a special service API key configured.

See the On-Premise API Keys service document for details about how to get the service API keys.
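In practice, a regular API key is passed along with each request to a deployed service. The hostname, path, and `key` parameter below are hypothetical placeholders; consult the documentation of the specific service for the actual request format:

```shell
# Hypothetical example: query a deployed search-like API with a regular API key.
# Replace the host, path, and key value with the ones from your deployment.
curl "https://catalog.onpremise.example.com/items?q=cafe&key=YOUR_REGULAR_API_KEY"
```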

Update and data management processes

To keep the On-Premise services and their data up to date, the following update/delivery mechanism is used:

  1. 2GIS CLI:

    1. Downloads the deployment artifacts from 2GIS Public Update Servers.

    2. Places the Docker images to Docker Registry, and data files to Deployment Artifacts Storage.

      Note:

      For other options of placing deployment artifacts, see 2GIS CLI.

  2. All artifacts then migrate from the public network to the private network, so that they become available to Helm and the On-Premise services. The migration process can be implemented in many ways depending on the specifics of the project. See the Deployment considerations section for details.

  3. Deployment Artifacts Storage is used by many On-Premise services either when doing the initial service deployment or when updating the existing deployment.

    Many services that rely on the storage have a dedicated Kubernetes Importer job that manages data lifecycle for the service:

    1. The job reads a manifest file in the Deployment Artifacts Storage, then determines if there is a new piece of data.

      This file serves as a simple database of objects and versions contained in the Deployment Artifacts Storage.

    2. When the job finds updated data in the Deployment Artifacts Storage, it spawns worker processes.

    3. The workers fetch the necessary deployment artifacts and import the new data into the service's data storage as a separate copy.

    4. After the workers complete the data import, the job performs a series of health checks to ensure the integrity of the new data:

      1. If all checks complete successfully, the job removes the original data, so that the new data replaces it.
      2. If one or more checks fail, the job stops the update process and requires action from the system administrator. The original data (if any) is left intact.
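The import lifecycle above can be sketched as the following pseudologic. All names (manifest.json, read_manifest_version, import_data, run_health_checks, drop_data) are illustrative, not actual On-Premise tooling:

```
# Sketch of the blue/green data import performed by an Importer job.
NEW_VERSION=$(read_manifest_version manifest.json)   # step 1: check the manifest
if [ "$NEW_VERSION" != "$CURRENT_VERSION" ]; then
    import_data "$NEW_VERSION"                       # steps 2-3: import as a separate copy
    if run_health_checks "$NEW_VERSION"; then        # step 4: verify integrity
        drop_data "$CURRENT_VERSION"                 # success: new data replaces the original
    else
        echo "Health checks failed; original data left intact" >&2
        exit 1                                       # failure: administrator action required
    fi
fi
```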
  4. Common update scenarios are listed below.

    Important note:

    The update process may vary from service to service. Consult the documentation of the individual service for details.

    It is possible to:

    1. Update a service with Helm.

      Helm updates the service in a way similar to how the Kubernetes job updates the data: new service instances are deployed next to the current ones. If health checks complete successfully, traffic is redirected to the new set of instances. Otherwise, the process stops, requiring action from the system administrator.

    2. Update a service and its data with Helm, if supported by the service.

      Helm will launch the service's Kubernetes Importer job to update the data, then Helm will update the service.

    3. Update a service's data only, if supported by the service. The corresponding Kubernetes Importer job is scheduled to run, for example, daily.
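For example, updating a deployed service with Helm typically looks like the following. The release name (keys-release), chart name (keys), and values file are placeholders; use the names from your own deployment:

```shell
# Refresh the local chart index, then upgrade the release
# using your existing configuration file.
helm repo update
helm upgrade keys-release 2gis-on-premise/keys -f values-keys.yaml
```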

Deployment considerations

Take the following information into account when preparing for deployment:

  1. Only some of the services and the update/delivery infrastructure have to be deployed in a public network with internet access. A data migration procedure should be developed so that the services and infrastructure residing in the private subnet have access to the deployment artifacts in the public network.

    For example, to deliver the deployment artifacts from the public network to the private network, you can deploy a private Docker Registry alongside a private S3-compatible storage and set up a synchronization job between them and their public counterparts.

    Example data migration process
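Such a synchronization job could be sketched with standard tooling. All remote, bucket, registry, and image names below are placeholders:

```shell
# "public" and "private" are rclone remotes configured for the respective
# S3-compatible storages; the bucket name is a placeholder.
rclone sync public:deployment-artifacts private:deployment-artifacts

# Mirror a Docker image between the public and private registries
# (registry hosts and image name are placeholders):
skopeo copy docker://registry.public.example.com/2gis/tiles-api:latest \
            docker://registry.private.example.com/2gis/tiles-api:latest
```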
  2. Helm charts that are used to deploy On-Premise services deploy only the services themselves. All the necessary shared infrastructure must be prepared before running the Helm charts. After that, specify the necessary information in YAML configuration files and run Helm.

  3. Most of the services should be placed behind a load balancer. When deploying via Helm chart, an Ingress resource is created for the necessary services. You can use an Ingress controller of your choice to implement the Ingress load balancer in your Kubernetes cluster (see the Service requirements section for details).

  4. Deployment Artifacts Storage that is used by 2GIS CLI requires regular maintenance to clear out outdated deployment artifacts. This helps prevent the storage from overflowing.

    Note:

    2GIS CLI does not track or manage free space in the Deployment Artifacts Storage or the Docker Registry. It is recommended to set up monitoring for these parts of the infrastructure and perform regular maintenance.
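For example, the standard Docker Registry ships with a garbage collector that can be run as part of such maintenance. The container name is a placeholder; the config path is the registry image's default:

```shell
# Delete blobs that are no longer referenced by any image manifest.
docker exec <registry-container> \
    registry garbage-collect /etc/docker/registry/config.yml
```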

  5. All On-Premise services, except the Traffic Proxy service, require no internet access to operate and therefore can be deployed in isolated private networks.

    If you plan to use the Traffic Proxy service, make sure to configure the private network's infrastructure to provide internet access to it.

  6. Do not confuse the access key for 2GIS CLI with regular API keys that are managed by the API Keys service, or service API keys.

    The API Keys service allows you to assign regular API keys to users of your On-Premise deployment, so that access control is enforced. This service also uses service API keys to communicate with the On-Premise services that require API keys from end users to operate.

    2GIS CLI uses a dedicated key to fetch deployment artifacts associated with purchased 2GIS products.

  7. 2GIS ships versioned On-Premise services. Each version of the On-Premise solution comprises a set of On-Premise services specific to that version.

    The On-Premise version can be specified when:

    • Installing a service.
    • Updating a service.
    • Downloading the deployment artifacts via 2GIS CLI utility.

    If the version is not specified, then the latest available version number will be used during the aforementioned operations.

    See release notes for the list of available versions.
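With Helm, for instance, the version can be pinned via the standard --version flag. The release name, chart name, values file, and version number below are placeholders:

```shell
# Install a specific On-Premise solution version instead of the latest one.
helm install keys-release 2gis-on-premise/keys \
    --version 1.2.3 \
    -f values-keys.yaml
```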

Deployment requirements

The following infrastructure should be prepared prior to doing any On-Premise solution deployment:

  1. A Kubernetes cluster.
  2. An Ingress controller of your choice. For example, NGINX Ingress Controller.
  3. A dedicated server to host the public services of the update/delivery infrastructure.
  4. Deployment Artifacts Storage, both public and private instances.
  5. Docker Registry, both public and private instances.
  6. An LDAP server.
  7. All Data Storage services.

See the Architecture of the on-premise solution and the Deployment considerations sections for details.
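For example, the NGINX Ingress Controller mentioned above can be installed into the cluster with its official Helm chart:

```shell
# Install the NGINX Ingress Controller (ingress-nginx) into its own namespace.
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
    --namespace ingress-nginx --create-namespace
```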

These requirements must be satisfied for both testing and production environments:

Common software:

  1. Operating system: Ubuntu 20.04 LTS
  2. Kubernetes: 1.19
  3. Docker: 19.03.4
  4. Docker Registry: 2.7.1

Data Storage services:

  1. Apache Cassandra: 3.11
  2. Apache Kafka: 2.7.0 with Apache ZooKeeper 3.4.13
  3. PostgreSQL: 11 with extensions PostGIS 2.5, JsQuery
  4. Redis: 6.2 (stable release)
  5. S3-compatible storage: for example, Ceph S3 v.14.2.22

Note:

Some On-Premise services have slightly different requirements for software versions. See the documentation of a specific service for details.

Testing environment

Note:

The configurations listed below are sample configurations. Contact 2GIS when planning any deployment to get calculations adapted to your environments and needs.

Service                      CPU   RAM, GB   Min replicas   Load balancer   Internet access   Storage requirements

Update/delivery infrastructure
  2GIS CLI                   -     -         -              No              Yes               S3 storage: see below; Docker Registry: see below
  S3-compatible storage      -     -         -              No              No                Own storage: 800 GB
  Docker Registry            -     -         -              No              No                Own storage: 100 GB

Traffic Proxy service
  NGINX reverse proxy        1     2         2              Yes             Yes               -

API Keys service
  Frontend                   1     1         2              Yes             No                -
  Backend                    1     1         2              Yes             No                Apache Kafka, PostgreSQL, Redis: see below
  Apache Kafka (+ ZooKeeper) 4     4         3              Yes             No                Own storage: 500 GB*
  PostgreSQL                 2     4         3              Yes             No                Own storage: 200 GB
  Redis                      Requirements will be specified later.

Maps
  MapGL JS API               1     2         2              Yes             No                -
  Tiles API Backend          1     0.5       2              Yes             No                Apache Cassandra: see below
  Apache Cassandra           1     16        3              Yes             No                Own storage: 500 GB
  Kubernetes Importer job    1     4         1              No              No                -

Navigation
  Navi-Front                 Requirements will be specified later.
  Navi-Router                Requirements will be specified later.
  Navi-Back                  Requirements will be specified later.
  Navi-Castle                Requirements will be specified later.         K8S Persistent Volume: see below
  Distance Matrix Async API  Requirements will be specified later.         Apache Kafka, PostgreSQL, S3 storage: see below
  K8S Persistent Volume**    -     -         -              -               -                Own storage: 5 GB for each Navi-Castle replica

GIS Platform
  Portal frontend            2     1         2              Yes             No                -
  SPCore backend             8     4         2              Yes             No                PostgreSQL, S3-compatible storage: see below
  PostgreSQL                 4     2         3              Yes             No                Own storage: 100 GB
  S3-compatible storage      -     -         -              Yes             No                Own storage: 4 TB***
  ZooKeeper                  2     2         2              Yes             No                -

* Note that these storage requirements may vary depending on the configured statistics storage time period. The greater this period is, the more storage space is required.

** This storage requirement is optional. However, it is highly recommended to configure Persistent Volume and Persistent Volume Claim storage features in your Kubernetes cluster.

*** Note that these storage requirements are calculated for the use case of storing large volumes of high-resolution tiled images, for example, satellite imagery. If you do not plan to store such data, the storage requirements may be lower.

Production environment

Note:

The configurations listed below are sample configurations. Contact 2GIS when planning any deployment to get calculations adapted to your environments and needs.

Service                      CPU   RAM, GB   Min replicas   Load balancer   Internet access   Storage requirements

Update/delivery infrastructure
  2GIS CLI                   -     -         -              No              Yes               S3 storage: see below; Docker Registry: see below
  S3-compatible storage      -     -         -              No              No                Own storage: 800 GB
  Docker Registry            -     -         -              No              No                Own storage: 100 GB

Traffic Proxy service
  NGINX reverse proxy        2     2         2              Yes             Yes               -

API Keys service
  Frontend                   1     1         2              Yes             No                -
  Backend                    1     1         2              Yes             No                Apache Kafka, PostgreSQL, Redis: see below
  Apache Kafka (+ ZooKeeper) 8     12        3              Yes             No                Own storage: 500 GB*
  PostgreSQL                 2     4         3              Yes             No                Own storage: 200 GB
  Redis                      Requirements will be specified later.

Maps
  MapGL JS API               2     2         2              Yes             No                -
  Tiles API Backend          4     0.5       2              Yes             No                Apache Cassandra: see below
  Apache Cassandra           4     16        3              Yes             No                Own storage: 500 GB
  Kubernetes Importer job    4     4         1              No              No                -

Navigation
  Navi-Front                 Requirements will be specified later.
  Navi-Router                Requirements will be specified later.
  Navi-Back                  Requirements will be specified later.
  Navi-Castle                Requirements will be specified later.         K8S Persistent Volume: see below
  Distance Matrix Async API  Requirements will be specified later.         Apache Kafka, PostgreSQL, S3 storage: see below
  K8S Persistent Volume**    -     -         -              -               -                Own storage: 5 GB for each Navi-Castle replica

GIS Platform
  Portal frontend            2     1         2              Yes             No                -
  SPCore backend             8     4         2              Yes             No                PostgreSQL, S3-compatible storage: see below
  PostgreSQL                 16    32        2              Yes             No                Own storage: 100 GB
  S3-compatible storage      -     -         -              Yes             No                Own storage: 4 TB***
  ZooKeeper                  4     8         2              Yes             No                -

* Note that these storage requirements may vary depending on the configured statistics storage time period. The greater this period is, the more storage space is required.

** This storage requirement is optional. However, it is highly recommended to configure Persistent Volume and Persistent Volume Claim storage features in your Kubernetes cluster.

*** Note that these storage requirements are calculated for the use case of storing large volumes of high-resolution tiled images, for example, satellite imagery. If you do not plan to store such data, the storage requirements may be lower.

Deployment checklist

Prior to doing any deployment, do the following:

  • Check that your infrastructure satisfies the requirements listed in the Deployment requirements section.
  • Check the operability of:
    • Kubernetes cluster with Helm installed
    • Data storage services
    • Docker CLI and 2GIS CLI
    • Deployment Artifacts Storage and Docker Registry
  • Check that the necessary public infrastructure is accessible from the private network and vice versa
  • Check that you have the access key for 2GIS CLI

These steps are common for every On-Premise service deployment:

  1. Examine the checklist.

  2. Fetch the up-to-date deployment artifacts from 2GIS servers using 2GIS CLI:

    Important note:

    Write down the path to the manifest file, as this information will be needed later. The path will be displayed in the standard output stream (stdout) when 2GIS CLI finishes working.

  3. Deliver the deployment artifacts from public network to private network. See the Deployment considerations section for details.

  4. Add the repository with 2GIS Helm charts:

    helm repo add 2gis-on-premise https://2gis.github.io/on-premise-helm-charts && \
    helm repo update
    

    Execute the following command to check the availability of the repository:

    helm search repo 2gis-on-premise
    

    If the command's output contains a non-empty list of charts, then the repository is successfully added.

  5. Deploy API Keys service. Check its availability via a web browser.
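Besides a web browser, the availability of the API Keys service can be checked from the command line. The hostname is a placeholder taken from your Ingress configuration:

```shell
# Expect an HTTP 200 response from the API Keys service frontend.
curl -I https://keys.onpremise.example.com/
```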

  6. Deploy the required On-Premise service, using the information from this section.

    See the documentation for the specific service for details: