Selling products online is on the rise; however, small and medium-sized businesses face a significant barrier to entering the market. The main reason is the high cost of setting up and maintaining an online store. This is where an e-commerce marketplace comes in. An e-commerce marketplace is a platform that allows vendors to list their products and buyers to purchase them, with the platform taking a commission on each sale. Vendors can focus on their core business and buyers can choose from a wide variety of products, while the task of developing and maintaining the platform where the actual sales and logistics take place is delegated to the marketplace. The marketplace also lets vendors advertise their products and lets buyers find them and leave reviews. The information system I propose and design in this project aims to realise such an E-Commerce platform.
In what follows, I will describe the logical architecture of the system using textual and UML-diagrammatic representations.
To give a better impression of the system and its foreseen capabilities, I present a use case diagram in this section.
In the context of the proposed system, the following actors can be identified.
The diagrammed use cases are specified as follows.
The component diagram presented below shows the components of the system and how they work together to achieve the information system's goals.
The diagram below presents where the previously introduced components are planned to be deployed: locally on a client device, on an edge node or on a cloud node.
Looking at the functional and non-functional requirements of the system, it is clear that the combined use of cloud and edge deployment techniques is justified. A public cloud deployment is sensible for the system's core functionality to achieve high availability, scalability, fault tolerance, and elasticity.
On the other hand, the use of edge deployment to squeeze out the last millisecond of performance is justified by the nature of an e-commerce system and the behaviour of consumers: various studies, such as the one conducted jointly by Google and Deloitte, outline that "milliseconds make millions", that is, the performance of an e-commerce system is crucial to maximizing conversion rates and revenue. Furthermore, targeted advertisements and personalized recommendations also need to be supported by the system. This requires collecting and processing users' data, based on which highly personalized content can be delivered dynamically. Since this dynamic content can be influenced by users' location and by how they interact with the frontend in real time (what mouse or touch events they perform, how much time they spend on a given page segment, etc.), it is crucial to have this processing as close as possible to the users. Complying with local data privacy regulations is also a concern, which is another reason to process the data as close as possible to the users, without having to transfer it to a faraway, potentially out-of-continent data center.
Furthermore, for the system to support search with image recognition and object detection (so that users can search and browse products using images or descriptions), the use of AI is justified. For fraudulent activity detection, the combined use of edge computing and AI is also reasonable, so that fraudulent activities such as fake accounts, fake reviews, or fraudulent transactions can be detected in real time, helping to protect the system and its users from scams and to ensure the integrity and security of the platform.
As the previous diagram illustrates, the E-Commerce Dashboard components are all deployed to the end-user client devices. In the case of a Customer, the Local Analytics AI component is also deployed onto the device and uses federated learning. Browsing and purchasing data is kept in a separate local dataset for each user, and a machine learning model is trained on these local datasets in a decentralized way. Because training happens where the data resides, the platform does not need access to the raw data, which stays private and secure. The trained model is then used to make personalized recommendations to users based on their browsing and purchasing history. As users continue to browse and make purchases on the platform, their local datasets are updated and the model is retrained using federated learning to improve its accuracy and relevance.
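The federated training loop described above can be illustrated with a minimal federated averaging sketch. It is not the project's implementation: the linear model, the function names (`local_update`, `federated_average`, `train`) and the hyperparameters are assumptions chosen purely for illustration. Each client trains on its own dataset, and only the resulting weights are combined on the server, weighted by dataset size.

```python
from typing import List, Tuple

# One training sample: (feature vector, target value).
Sample = Tuple[List[float], float]


def predict(weights: List[float], features: List[float]) -> float:
    """Linear model: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def local_update(weights: List[float], dataset: List[Sample],
                 lr: float = 0.1, epochs: int = 20) -> List[float]:
    """Runs on the client device: gradient descent on the local data only."""
    w = list(weights)
    for _ in range(epochs):
        for features, target in dataset:
            error = predict(w, features) - target
            w = [wi - lr * error * xi for wi, xi in zip(w, features)]
    return w


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Runs on the server: size-weighted mean of the client models."""
    total = sum(client_sizes)
    dims = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dims)
    ]


def train(global_weights: List[float], client_datasets: List[List[Sample]],
          rounds: int = 10) -> List[float]:
    """Repeats local training and server-side averaging for several rounds."""
    for _ in range(rounds):
        # Only model weights leave the clients, never the raw samples.
        updates = [local_update(global_weights, ds) for ds in client_datasets]
        sizes = [len(ds) for ds in client_datasets]
        global_weights = federated_average(updates, sizes)
    return global_weights
```

In this sketch the server only ever sees weight vectors. A production setup would typically add measures such as secure aggregation, which are out of scope here.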
The Order Management and Transaction Management components are deployed to edge nodes to minimize latency and to ensure that the system is highly available in all areas. Furthermore, a Data Anonymization component is also deployed here for privacy reasons, so that data is fully anonymized before being sent to the cloud and the system complies with local data protection laws. Finally, a Fraud Detection component is also deployed to the edge nodes to detect fraudulent activities in real time.
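As a rough sketch of what the Data Anonymization component on an edge node might do before forwarding an event to the cloud, the following example drops direct identifiers, replaces the user ID with a keyed hash, and coarsens the location. The field names and the key handling are hypothetical, since the event schema is not specified here. Note also that a keyed hash is, strictly speaking, pseudonymization; full anonymization under a given data protection law may require stronger measures.

```python
import hashlib
import hmac
from typing import Any, Dict

# Hypothetical field names; the actual event schema is not defined in this document.
DIRECT_IDENTIFIERS = {"name", "email", "street_address"}


def pseudonymize_id(user_id: str, secret_key: bytes) -> str:
    """Replace the user ID with an HMAC-SHA256 token derived from an edge-local key."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_event(event: Dict[str, Any], secret_key: bytes) -> Dict[str, Any]:
    """Return a cleaned copy of the event; the input dictionary is not modified."""
    # Drop fields that identify a person directly.
    cleaned = {k: v for k, v in event.items() if k not in DIRECT_IDENTIFIERS}
    # Keep events of the same user linkable without exposing the real ID.
    cleaned["user_id"] = pseudonymize_id(str(event["user_id"]), secret_key)
    # Coarsen coordinates to one decimal place (about 11 km of latitude).
    if "lat" in cleaned and "lon" in cleaned:
        cleaned["lat"] = round(cleaned["lat"], 1)
        cleaned["lon"] = round(cleaned["lon"], 1)
    return cleaned
```

Because the key stays on the edge node in this sketch, the cloud side can correlate events belonging to the same token but cannot recompute the token from a known user ID.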
In what follows, I identify concrete cloud patterns and justify their usage in the system.
The identification of vendor-agnostic cloud patterns is a crucial step when designing a cloud-based information system. It makes it possible to reason about the system and its needs without committing to any of the mainstream cloud providers, which prevents early vendor lock-in and promotes cloud portability. Below I identify the vendor-agnostic cloud patterns that are applicable to the system.
| Component(s) | Pattern | Description |
|---|---|---|
| System-level | Public Cloud | IT resources are provided as a service to a very large customer group in order to enable elastic use of a static resource pool. Given that this E-Commerce platform should be available and scalable globally, the use of a public cloud is reasonable. |
| Component(s) | Pattern | Description |
|---|---|---|
| E-Commerce Dashboard | Content Distribution Network | Application component instances and the data they handle are distributed globally to meet the access performance required by a global user group. The storefront UI needs to serve several media assets such as product previews, review images, etc. Using a CDN can help provide a better experience for users by loading these assets more quickly. |
| Component(s) | Pattern | Description |
|---|---|---|
| System-level | Distributed Application | A cloud application divides provided functionality among multiple application components that can be scaled out independently. This pattern allows the system to be broken down into smaller, independent components that can be developed, tested, and deployed separately. This makes it easier to manage the development and maintenance of the E-Commerce platform, and allows changes and updates to individual components without affecting the entire system. The pattern can also improve the scalability of the platform and allow it to handle increased traffic and demand without performance degradation. |
| Component(s) | Pattern | Description |
|---|---|---|
| | User Interface Component | Customizable user interfaces are accessed by humans. Application internal interaction is realized asynchronously to ensure loose coupling. The platform is made accessible to the end-users via easy-to-use, graphical user interfaces. |
| | Data Access Component | Access to data is handled by components that isolate complexity, enable additional consistency, and ensure adjustability of data elements. Several components in the system need to store state in order to provide persistence and a consistent user experience. A data access component can be made responsible for communicating with the cloud-based data store, which contains the data for the products, customers, and orders in the E-Commerce platform. It can expose a set of methods that can be used by the rest of the application to perform operations on the data, such as reading a product, updating a customer's address, or deleting an order. It abstracts the details of the cloud-based data store and the data access and manipulation logic, allowing the application to scale and evolve as the business grows and changes. |
| As above | Stateful Component | Multiple instances of a scaled-out application component synchronize their internal state to provide a unified behavior. Several components in the system need to store state in order to provide persistence and a consistent user experience. The stateful component is responsible for managing the state of the application, which includes data such as the products, customers, and orders in the E-Commerce platform. |
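To make the Data Access Component row more concrete, the sketch below shows one possible shape of such a component for customers. All names (`Customer`, `CustomerRepository`, `InMemoryCustomerRepository`, `update_address`) are hypothetical, and the in-memory dictionary merely stands in for the cloud-based data store, which is an assumption of this example rather than part of the design.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str
    address: str


class CustomerRepository(ABC):
    """Interface the rest of the application programs against."""

    @abstractmethod
    def get(self, customer_id: str) -> Optional[Customer]:
        ...

    @abstractmethod
    def save(self, customer: Customer) -> None:
        ...

    @abstractmethod
    def delete(self, customer_id: str) -> bool:
        ...


class InMemoryCustomerRepository(CustomerRepository):
    """Stand-in for a store-backed implementation (assumption for this sketch)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Customer] = {}

    def get(self, customer_id: str) -> Optional[Customer]:
        return self._rows.get(customer_id)

    def save(self, customer: Customer) -> None:
        self._rows[customer.customer_id] = customer

    def delete(self, customer_id: str) -> bool:
        # Returns True only if a record was actually removed.
        return self._rows.pop(customer_id, None) is not None


def update_address(repo: CustomerRepository, customer_id: str, new_address: str) -> bool:
    """Business logic depends only on the interface, not on the storage technology."""
    customer = repo.get(customer_id)
    if customer is None:
        return False
    repo.save(replace(customer, address=new_address))
    return True
```

Swapping the storage backend would then mean providing another `CustomerRepository` implementation, while callers such as `update_address` stay unchanged.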
| Component(s) | Pattern | Description |
|---|---|---|
| | Exactly-once Delivery | The messaging system ensures that each message is delivered exactly once by filtering possible message duplicates automatically. After a customer buys a product, a workflow is triggered, notifying both the customer and the vendor about the event. Events in the order, product inventory, and review management components also trigger such workflows. |
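The duplicate filtering mentioned in the Exactly-once Delivery row can be sketched as follows. This is a simplified illustration, not the behaviour of any particular messaging product: the class name `DeduplicatingConsumer` and the in-memory set of seen message IDs are assumptions, and a real system would need to persist that set durably.

```python
from typing import Any, Callable, Dict, Set


class DeduplicatingConsumer:
    """Invokes the handler at most once per message ID, even on redelivery."""

    def __init__(self, handler: Callable[[Dict[str, Any]], None]) -> None:
        self._handler = handler
        self._seen_ids: Set[str] = set()

    def receive(self, message_id: str, payload: Dict[str, Any]) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen_ids:
            return False
        self._handler(payload)
        # Mark as seen only after the handler succeeded, so that a failed
        # attempt can still be retried on redelivery.
        self._seen_ids.add(message_id)
        return True
```

With this filter in place, a redelivered "order placed" event would not cause the customer and the vendor to be notified twice.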
| Component(s) | Pattern | Description |
|---|---|---|
| System-level | Unpredictable Workload | IT resources with a random and unforeseeable utilization over time experience unpredictable workload. While the workload of such a platform can be predicted most of the time, there are edge cases that could lead to unusual usage peaks: if a vendor introduces a new product that is in extremely high demand, or attracts unusually high traffic through their own marketing campaigns, the platform's load can vary greatly. |
| Component(s) | Pattern | Description |
|---|---|---|
| System-level | Elastic Load Balancer | The number of synchronous accesses is used to adjust the number of required application component instances. To serve users seamlessly, even when there is a peak in the number of visitors, new application instances might need to be provisioned automatically. A load balancer can assist in fulfilling this requirement. |
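The scaling rule behind the Elastic Load Balancer pattern can be written down as a small calculation. The sketch below is only an illustration: the function name `required_instances`, the per-instance capacity, and the instance limits are assumed values, not figures taken from the system's requirements.

```python
import math


def required_instances(concurrent_requests: int,
                       capacity_per_instance: int,
                       min_instances: int = 1,
                       max_instances: int = 20) -> int:
    """Derive the instance count from the number of synchronous accesses."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    # Enough instances to serve the current load, rounded up.
    needed = math.ceil(concurrent_requests / capacity_per_instance)
    # Keep the result within the configured bounds.
    return max(min_instances, min(max_instances, needed))
```

For example, with an assumed capacity of 100 concurrent requests per instance, 250 concurrent requests would call for 3 instances, while an idle platform would stay at the configured minimum.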
| Component(s) | Pattern | Description |
|---|---|---|
| | Key-Value Storage | Semi-structured or unstructured data is stored with limited querying support but high performance, availability, and flexibility. A key-value storage system is used in the platform because it provides fast, efficient access to data. Since data is stored as key-value pairs, it can be retrieved quickly by using the key as an index. This is useful because we need to access large amounts of data quickly and efficiently in order to provide a smooth and responsive user experience, especially in the order, product inventory, review, and customer management modules, so that related requests can be served quickly. |
| System-level | Relational Database | Data is structured according to a schema that is enforced during data manipulation and enables expressive queries of handled data. A relational database system allows us to organize data in a structured, logical way, making it easier to manage and maintain the data. This is important for the proposed E-Commerce platform, as it is expected to have large amounts of data, such as products, customers, and orders, that need to be organized and managed efficiently. It also enforces data integrity and consistency, which can help ensure that the data in the system is accurate and reliable. |
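The difference between the two storage patterns in the table above can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module and a plain dictionary purely as stand-ins for a relational database and a key-value store; the table layout and the names `KeyValueStore`, `create_schema`, and `order_totals_by_customer` are assumptions made for this example.

```python
import sqlite3
from typing import Dict, Optional


class KeyValueStore:
    """Key-value access: fast lookups by key, no querying across values."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def create_schema(conn: sqlite3.Connection) -> None:
    """Relational access: the schema is enforced on every write."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce references
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """CREATE TABLE orders (
               id INTEGER PRIMARY KEY,
               customer_id INTEGER NOT NULL REFERENCES customers(id),
               total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
           )"""
    )


def order_totals_by_customer(conn: sqlite3.Connection) -> Dict[str, int]:
    """An expressive query (join plus aggregation) that a key lookup cannot express."""
    rows = conn.execute(
        """SELECT c.name, SUM(o.total_cents)
           FROM customers c JOIN orders o ON o.customer_id = c.id
           GROUP BY c.id, c.name
           ORDER BY c.name"""
    ).fetchall()
    return dict(rows)
```

In this sketch, inserting an order for a customer that does not exist is rejected by the database with an integrity error, whereas the key-value store accepts any value under any key.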
While the identification of cloud-agnostic patterns is a crucial step in designing a cloud-native application, it is also important to identify vendor-specific patterns that leverage the capabilities of a specific cloud provider and make it possible to actually bring the system into a production environment. In this section, I identify vendor-specific patterns that can be used to implement the cloud-agnostic patterns identified in the previous section. Where applicable, I also specify whether an open-source solution is used to implement the pattern in order to reduce the coupling between the application and the cloud provider.
| Vendor Agnostic Pattern | Vendor Specific Pattern | Open Source Artifact(s) | Description |
|---|---|---|---|
| Public Cloud | Google Cloud | 🚫 | Public Cloud ➡️ Google Cloud |
| Content Distribution Network | Cloud CDN | 🚫 | Content Distribution Network ➡️ Cloud CDN |
| Exactly-once Delivery | Pub/Sub | 🚫 | Exactly-once Delivery ➡️ Pub/Sub |
| Key-Value Storage | Memorystore | Redis | Key-Value Storage ➡️ Memorystore. Redis can be deployed to GCP's Memorystore offering as an Open Source artifact. |
| Relational Database | Cloud SQL | PostgreSQL | Relational Database ➡️ Cloud SQL. PostgreSQL can be deployed to GCP's Cloud SQL offering as an Open Source artifact. |
| User Interface Component | App Engine | nginx, Next.js | User Interface Component ➡️ App Engine. nginx and Next.js can be deployed to GCP's App Engine offering as Open Source artifacts. |
| Elastic Load Balancer | Google Kubernetes Engine - Ingress | Kubernetes, nginx Ingress Controller | Elastic Load Balancer ➡️ Google Kubernetes Engine - Ingress. Kubernetes and an nginx Ingress Controller can be deployed to GCP's Kubernetes Engine offering as Open Source artifacts. Note that GCP's Cloud Load Balancing offering would also be a viable choice here; however, since it would couple the E-Commerce platform more tightly to the cloud provider, the Kubernetes- and nginx-based solution is preferred to enhance cloud portability. |
After having identified the patterns and artifacts that are required to build the E-Commerce platform, the next step is to define the architecture of the platform on GCP. The diagram below shows the architecture of the E-Commerce platform on Google Cloud Platform, adapted from the original logical architecture diagram.
In order to keep the OpenTOSCA models as easy to survey as possible, I created one topology for the components deployed to the edge and a separate one for those deployed to the cloud. The topologies are shown below.
As is apparent from the topology diagrams, I closely followed the previously introduced architecture of the E-Commerce platform. In order to represent every single component with the proper artifact type, I created custom node types for the following components:
The relationships used to connect these components are:
ConnectsTo
DependsOn
HostedOn
Please note that I also explicitly modeled the container runtime of the Kubernetes Engine as a Docker Runtime, so that child components could be represented as containerized artifacts. While modeling a Kubernetes setup could have involved specifying pods, deployments, services, ingresses, etc., I decided to keep the model as simple as possible and abstract away all the Kubernetes-specific details to stay within the scope of this assignment.
It is important to mention that the OpenTOSCA diagrams involve various Google Cloud Platform-specific components; however, I specified open-source artifacts to be hosted on these vendor-specific components so that a reasonable level of cloud portability is achieved. If we were to migrate the E-Commerce platform to another cloud provider, we would have to replace the Google Cloud Platform components with the corresponding components of the target cloud provider. In this case, most of the OpenTOSCA components would remain unchanged, since only the components that are specific to the previous cloud provider would have to be replaced.
The proposed cloud architecture of the E-Commerce platform is designed to provide a scalable, highly available, and low-latency platform for online sales. The system is built using cloud and AI technologies, complemented by edge deployments, to ensure that it can handle a large number of concurrent users and withstand spikes in traffic. This allows the platform to provide a smooth and seamless experience for both vendors and customers. By identifying both vendor-agnostic and vendor-specific patterns, I was able to explore the design space of the E-Commerce platform and identify suitable implementation approaches, providing a solid foundation for the development of the platform without being tied to a specific vendor.
The below-listed resources were used to create this assignment, including technical documentation and graphical assets.