Designing Communication Architectures With Microservices
Sustainable Java Applications With Quick Warmup
Modern API Management
When assessing prominent topics across DZone — and the software engineering space more broadly — it simply felt incomplete to conduct research on the larger impacts of data and the cloud without talking about such a crucial component of modern software architectures: APIs. Communication is key in an era when applications and data capabilities are growing increasingly complex. Therefore, we set our sights on investigating the emerging ways in which data that would otherwise be isolated can better integrate with and work alongside other app components and across systems. For DZone's 2024 Modern API Management Trend Report, we focused our research specifically on APIs' growing influence across domains, prevalent paradigms and implementation techniques, security strategies, AI, and automation. Alongside observations from our original research, practicing tech professionals from the DZone Community contributed articles addressing key topics in the API space, including automated API generation via no- and low-code; communication architecture design among systems, APIs, and microservices; GraphQL vs. REST; and the role of APIs in the modern cloud-native landscape.
Open Source Migration Practices and Patterns
MongoDB Essentials
What Is Multi-Tenancy? Tenancy enables users to share cluster infrastructure among: Multiple teams within the organization Multiple customers of the organization Multiple environments of the application Shared clusters save costs and simplify administration. Security and isolation are key factors to consider when cluster resources are to be shared. Two prominent isolation models to achieve multi-tenancy are the hard and soft tenancy models. The key difference between these models lies in the level of isolation provided between tenants. Soft tenancy has a lower level of isolation and uses mechanisms like namespaces, quotas, and limits to restrict tenant access to resources and prevent tenants from interfering with each other. Hard tenancy provides stronger isolation and often involves separate clusters or virtual machines for each tenant, with minimal shared resources. Kubernetes Native Services in Multi-Tenant Implementations Kubernetes has a built-in namespace model to create logical partitions of the cluster as isolated slices. Though basic levels of tenancy can be achieved, using namespaces has some limitations: Implementing advanced multi-tenancy scenarios, like Hierarchical Namespaces (HNS) or exposing Container as a Service (CaaS), becomes complicated because of the flat structure of Kubernetes namespaces. Namespaces have no common concept of ownership. Tracking and administration challenges persist if a team controls multiple namespaces. Enforcing resource quotas and limits fairly across all tenants requires additional effort. Only highly privileged users can create namespaces. This means that whenever a team wants a new namespace, they must raise a ticket to the cluster administrator. While this is probably acceptable for small organizations, it generates unnecessary toil as the organization grows. To solve this problem, Kubernetes provides the Hierarchical Namespace Controller (HNC), which allows the user to organize namespaces into hierarchies. Namespaces are organized in a tree structure, where child namespaces inherit resources and policies from parent namespaces. While HNC supports a soft-tenancy approach that leverages existing namespaces, it is a newer project still under incubation in the Kubernetes community. Other widely used projects that provide similar capabilities include Capsule, Rafay, and Kiosk. In this article series, we will discuss implementing multi-tenant solutions using the Capsule framework. Capsule is a commercially supported open-source project that implements multi-tenancy on top of a single shared cluster by grouping namespaces into a lightweight abstraction called a Tenant. Capsule is one of the platforms recommended by the Kubernetes community for multi-tenancy. Major components of the Capsule framework include: Capsule controller: Aggregates multiple namespaces in a lightweight abstraction called Tenant. Capsule policy engine: Achieves tenant isolation through the various network and security policies, resource quotas, limit ranges, RBAC, and other policies defined at the tenant level. A user who owns a tenant is called a Tenant Owner. There is a small contrast between the roles of a tenant owner and a namespace administrator. Listed below are the roles and responsibilities of the cluster admin, the tenant owner, and the namespace administrator. Install Capsule Framework We will use the AWS EKS cluster to perform the exercise.
This article assumes you have already created an EKS cluster "eks-cluster1" and that the following software is already installed on your local machine: AWS CLI (Version 2) Kubectl (v1.21) Curl (8.1.2) Helm (3.8.2) Go (v1.20.6) Capsule can be installed in the two ways listed below: Using YAML Installer PowerShell aws eks --region us-east-1 update-kubeconfig --name eks-cluster1 kubectl apply -f https://raw.githubusercontent.com/clastix/capsule/master/config/install.yaml If you face any error in applying the YAML file, re-running the same command should fix the problem. If you see the status of the pod as “ImagePullBackOff” or “ErrImagePull,” delete the pod created by the deployment (not the deployment itself). Using Helm As a cluster admin or root user, run the following commands to install using Helm. PowerShell aws eks --region us-east-1 update-kubeconfig --name eks-cluster1 helm repo add clastix https://clastix.github.io/charts helm install capsule clastix/capsule -n capsule-system --create-namespace Verify Capsule Installation What gets installed with the Capsule framework: Namespace: capsule-system Deployments in Namespace: capsule-controller-manager Services Exposed: capsule-controller-manager-metrics-service capsule-webhook-service Secrets in Namespace: capsule-ca capsule-tls Webhooks: In Kubernetes, webhooks are a mechanism for external services to interact with the Kubernetes API server during the lifecycle of API requests. They act like HTTP callbacks, triggered at specific points in the request flow. This allows external services to perform validations or modifications on resources before they are persisted in the cluster. There are two main types of webhooks used in Kubernetes for admission control: Mutating Admission Webhooks and Validating Admission Webhooks. The following webhooks are installed: capsule-mutating-webhook-configuration capsule-validating-webhook-configuration Custom Resource Definitions (CRDs): CRDs allow the user to extend the API and introduce new types of resources beyond the built-in ones. Imagine them as blueprints for creating your own custom resources that can be managed alongside familiar resources like Deployments and Pods. The CRDs below are installed: capsuleconfigurations.capsule.clastix.io globaltenantresources.capsule.clastix.io tenantresources.capsule.clastix.io tenants.capsule.clastix.io Cluster Roles capsule-namespace-deleter capsule-namespace-provisioner Cluster Role Bindings capsule-manager-rolebinding capsule-proxy-rolebinding Follow the steps below to check if Capsule is installed properly: Log in as a root user or cluster administrator and run the following commands. This should list the ‘capsule-system’ namespace. PowerShell aws eks --region us-east-1 update-kubeconfig --name eks-cluster1 kubectl get ns Run the commands below to see Capsule-related components. PowerShell kubectl -n capsule-system get deployments kubectl -n capsule-system get svc kubectl get mutatingwebhookconfigurations kubectl get validatingwebhookconfigurations List the Capsule CRDs installed. PowerShell kubectl get crds If any of the CRDs are missing, apply the respective kubectl command mentioned below. Note the Capsule version in the URL; adjust it to match the version you want to install or upgrade to.
PowerShell kubectl apply -f https://raw.githubusercontent.com/clastix/capsule/v0.3.3/charts/capsule/crds/globaltenantresources-crd.yaml kubectl apply -f https://raw.githubusercontent.com/clastix/capsule/v0.3.3/charts/capsule/crds/tenant-crd.yaml kubectl apply -f https://raw.githubusercontent.com/clastix/capsule/v0.3.3/charts/capsule/crds/tenantresources-crd.yaml View the cluster roles and cluster role bindings by running the commands below: kubectl get clusterrolebindings kubectl get clusterroles Verify the resource utilization of the framework. PowerShell kubectl -n capsule-system get pods kubectl top pod <<pod name>> -n capsule-system --containers The Capsule framework creates one pod replica. The CPU (cores) should be around 3m and Memory (bytes) around 26Mi. Verify the tenants available by running the command below as the cluster admin. The result should be “No Resources Found.” PowerShell kubectl get tenants Summary In this part, we have covered what multi-tenancy is, the different types of tenant isolation models, the challenges with Kubernetes native services, and how to install the Capsule framework on AWS EKS. In the next part, we will further deep-dive into creating tenants and policy management.
Generative AI development has been democratized, thanks to powerful Machine Learning models (specifically Large Language Models such as Claude, Meta's Llama 2, etc.) being exposed by managed platforms/services as API calls. This frees developers from infrastructure concerns and lets them focus on the core business problems. This also means that developers are free to use the programming language best suited for their solution. Python has typically been the go-to language when it comes to AI/ML solutions, but there is more flexibility in this area. In this post, you will see how to leverage the Go programming language to use Vector Databases and techniques such as Retrieval Augmented Generation (RAG) with langchaingo. If you are a Go developer who wants to learn how to build generative AI applications, you are in the right place! If you are looking for introductory content on using Go for AI/ML, feel free to check out my previous blogs and open-source projects in this space. First, let's take a step back and get some context before diving into the hands-on part of this post. The Limitations of LLMs Large Language Models (LLMs) and other foundation models have been trained on a large corpus of data, enabling them to perform well at many natural language processing (NLP) tasks. But one of the most important limitations is that most foundation models and LLMs use a static dataset, which often has a specific knowledge cut-off (say, January 2022). For example, if you were to ask about an event that took place after the cut-off date, it would either fail to answer (which is fine) or, worse, confidently reply with an incorrect response — this is often referred to as hallucination. We need to consider the fact that LLMs only respond based on the data they were trained on — this limits their ability to accurately answer questions on topics that are either specialized or proprietary. For instance, if I were to ask a question about a specific AWS service, the LLM may (or may not) be able to come up with an accurate response. Wouldn't it be nice if the LLM could use the official AWS service documentation as a reference? RAG (Retrieval Augmented Generation) Helps Alleviate These Issues RAG enhances LLMs by dynamically retrieving external information during the response generation process, thereby expanding the model's knowledge base beyond its original training data. RAG-based solutions incorporate a vector store which can be indexed and queried to retrieve the most recent and relevant information, thereby extending the LLM's knowledge beyond its training cut-off. When an LLM equipped with RAG needs to generate a response, it first queries a vector store to find relevant, up-to-date information related to the query. This process ensures that the model's outputs are not just based on its pre-existing knowledge but are augmented with the latest information, thereby improving the accuracy and relevance of its responses. But, RAG Is Not the Only Way Although this post focuses solely on RAG, there are other ways to work around this problem, each with its pros and cons: Task-specific tuning: Fine-tuning large language models on specific tasks or datasets to improve their performance in those domains. Prompt engineering: Carefully designing input prompts to guide language models towards desired outputs, without requiring significant architectural changes. Few-shot and zero-shot learning: Techniques that enable language models to adapt to new tasks with limited or no additional training data.
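To make the retrieve-then-augment idea described above concrete, here is a minimal, framework-free sketch in Go of the augmentation step: snippets returned by a semantic search are folded into the prompt that is ultimately sent to the LLM. The helper name and the sample snippets are hypothetical and purely illustrative; later in this post, langchaingo's retrieval QA chain performs this step for us.
Go
package main

import (
	"fmt"
	"strings"
)

// buildAugmentedPrompt combines retrieved context snippets with the user's question.
// In a real RAG pipeline, the snippets would come from a vector store query.
func buildAugmentedPrompt(question string, retrievedDocs []string) string {
	var sb strings.Builder
	sb.WriteString("Use only the following context to answer the question.\n\n")
	sb.WriteString("Context:\n")
	for _, doc := range retrievedDocs {
		sb.WriteString("- " + doc + "\n")
	}
	sb.WriteString("\nQuestion: " + question + "\n")
	return sb.String()
}

func main() {
	// Hypothetical snippets standing in for semantic search results.
	docs := []string{
		"NoSQL Workbench for DynamoDB is a client-side GUI application for data modeling.",
		"It is available for Windows, macOS, and Linux.",
	}
	fmt.Println(buildAugmentedPrompt("What tools can I use to design DynamoDB data models?", docs))
}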
Vector Store and Embeddings I mentioned vector store a few times in the last paragraph. These are nothing but databases that store and index vector embeddings, which are numerical representations of data such as text, images, or entities. Embeddings help us go beyond basic search since they represent the semantic meaning of the source data — hence the term semantic search, which is a technique that understands the meaning and context of words to improve search accuracy and relevance. Vector databases can also store metadata, including references to the original data source (for example, the URL of a web document) of the embedding. Thanks to generative AI technologies, there has also been an explosion in vector databases. These include established SQL and NoSQL databases that you may already be using in other parts of your architecture — such as PostgreSQL, Redis, MongoDB, and OpenSearch. But there are also databases that are custom-built for vector storage. Some of these include Pinecone, Milvus, Weaviate, etc. Alright, let's go back to RAG... What Does a Typical RAG Workflow Look Like? At a high level, RAG-based solutions have the following workflow. These steps are often executed as a cohesive pipeline: Retrieve data from a variety of external sources like documents, images, web URLs, databases, proprietary data sources, etc. This consists of sub-steps such as chunking, which involves splitting up large datasets (e.g., a 100 MB PDF file) into smaller parts (for indexing). Create embeddings: This involves using an embedding model to convert data into numerical representations. Store/index the embeddings in a vector store. Ultimately, this is integrated into a larger application where the contextual data (semantic search results) is provided to LLMs (along with the prompts). End-To-End RAG Workflow in Action Each of the workflow steps can be executed with different components. The ones used in this blog include: PostgreSQL: It will be used as a vector database, thanks to the pgvector extension. To keep things simple, we will run it in Docker. langchaingo: It is a Go port of the langchain framework. It provides plugins for various components, including vector stores. We will use it for loading data from web URLs and indexing it in PostgreSQL. Text and embedding models: We will use Amazon Bedrock Claude and Titan models (for text and embeddings, respectively) with langchaingo. Retrieval and app integration: langchaingo vector store (for semantic search) and chain (for RAG). You will get a sense of how these individual pieces work. We will cover other variants of this architecture in subsequent blogs. Before You Begin Make sure you have: Go, Docker, and psql installed (e.g., using Homebrew if you're on a Mac). Amazon Bedrock access configured from your local machine - refer to this blog post for details. Start PostgreSQL on Docker There is a Docker image we can use!
docker run --name pgvector --rm -it -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres ankane/pgvector Activate the pgvector extension by logging into PostgreSQL (using psql) from a different terminal: # enter postgres when prompted for password psql -h localhost -U postgres -W CREATE EXTENSION IF NOT EXISTS vector; Load Data Into PostgreSQL (Vector Store) Clone the project repository: git clone https://github.com/build-on-aws/rag-golang-postgresql-langchain cd rag-golang-postgresql-langchain At this point, I am assuming that your local machine is configured to work with Amazon Bedrock. The first thing we will do is load data into PostgreSQL. In this case, we will use an existing web page as the source of information. I have used this developer guide — but feel free to use your own! Make sure to change the search query accordingly in the subsequent steps. export PG_HOST=localhost export PG_USER=postgres export PG_PASSWORD=postgres export PG_DB=postgres go run *.go -action=load -source=https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-general-nosql-design.html You should get the following output: loading data from https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-general-nosql-design.html vector store ready - postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable no. of documents to be loaded 23 Give it a few seconds. Finally, you should see this output if all goes well: data successfully loaded into vector store To verify, go back to the psql terminal and check the tables: \d You should see a couple of tables — langchain_pg_collection and langchain_pg_embedding. These are created by langchaingo since we did not specify them explicitly (that's ok, it's convenient for getting started!). langchain_pg_collection contains the collection name, while langchain_pg_embedding stores the actual embeddings. | Schema | Name | Type | Owner | |--------|-------------------------|-------|----------| | public | langchain_pg_collection | table | postgres | | public | langchain_pg_embedding | table | postgres | You can introspect the tables: select * from langchain_pg_collection; select count(*) from langchain_pg_embedding; select collection_id, document, uuid from langchain_pg_embedding LIMIT 1; You will see 23 rows in the langchain_pg_embedding table, since that was the number of langchain documents that our web page source was split into (refer to the application logs above when you loaded the data). A quick detour into how this works... The data loading implementation is in load.go, but let's look at how we access the vector store instance (in common.go): brc := bedrockruntime.NewFromConfig(cfg) embeddingModel, err := bedrock.NewBedrock(bedrock.WithClient(brc), bedrock.WithModel(bedrock.ModelTitanEmbedG1)) //... store, err = pgvector.New( context.Background(), pgvector.WithConnectionURL(pgConnURL), pgvector.WithEmbedder(embeddingModel), ) pgvector.WithConnectionURL is where the connection information for the PostgreSQL instance is provided. pgvector.WithEmbedder is the interesting part, since this is where we can plug in the embedding model of our choice. langchaingo supports Amazon Bedrock embeddings. In this case, I have used the Amazon Bedrock Titan embedding model. Back to the loading process in load.go. We first get the data in the form of a slice of schema.Document (getDocs function) using the langchaingo in-built HTML loader for this.
docs, err := documentloaders.NewHTML(resp.Body).LoadAndSplit(context.Background(), textsplitter.NewRecursiveCharacter()) Then, we load it into PostgreSQL. Instead of writing everything by ourselves, we can use the langchaingo vector store abstraction and use the high-level function AddDocuments: _, err = store.AddDocuments(context.Background(), docs) Great. We have set up a simple pipeline to fetch and ingest data into PostgreSQL. Let's make use of it! Execute Semantic Search Let's ask a question. I am going with "What tools can I use to design dynamodb data models?" relevant to this document which I used as the data source — feel free to tune it as per your scenario. export PG_HOST=localhost export PG_USER=postgres export PG_PASSWORD=postgres export PG_DB=postgres go run *.go -action=semantic_search -query="what tools can I use to design dynamodb data models?" -maxResults=3 You should see a similar output — note that we opted to output a maximum of three results (you can change it): vector store ready ============== similarity search results ============== similarity search info - can build new data models from, or design models based on, existing data models that satisfy your application's data access patterns. You can also import and export the designed data model at the end of the process. For more information, see Building data models with NoSQL Workbench similarity search score - 0.3141409 ============================ similarity search info - NoSQL Workbench for DynamoDB is a cross-platform, client-side GUI application that you can use for modern database development and operations. It's available for Windows, macOS, and Linux. NoSQL Workbench is a visual development tool that provides data modeling, data visualization, sample data generation, and query development features to help you design, create, query, and manage DynamoDB tables. With NoSQL Workbench for DynamoDB, you similarity search score - 0.3186116 ============================ similarity search info - key-value pairs or document storage. When you switch from a relational database management system to a NoSQL database system like DynamoDB, it's important to understand the key differences and specific design approaches.TopicsDifferences between relational data design and NoSQLTwo key concepts for NoSQL designApproaching NoSQL designNoSQL Workbench for DynamoDB Differences between relational data design and NoSQL similarity search score - 0.3275382 ============================ Now what you see here are the top three results (thanks to -maxResults=3). Note that this is not an answer to our question. These are the results from our vector store that are semantically close to the query — the keyword here is semantic. Thanks to the vector store abstraction in langchaingo, we were able to easily ingest our source data into PostgreSQL and use the SimilaritySearch function to get the top N results corresponding to our query (see semanticSearch function in query.go): Note that (at the time of writing) the pgvector implementation in langchaingo uses cosine distance vector operation but pgvector also supports L2 and inner product - for details, refer to the pgvector documentation. Ok, so far we have: Loaded vector data Executed semantic search This is the stepping stone to RAG (Retrieval Augmented Generation) - let's see it in action! 
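As a quick aside before moving on: here is a rough, hedged sketch of what the semanticSearch helper referenced above might look like. It assumes the imports and the pgvector-backed store variable created earlier (via pgvector.New); the exact implementation in the repository's query.go may differ.
Go
// semanticSearch delegates to the langchaingo vector store and prints the top matches.
func semanticSearch(ctx context.Context, query string, maxResults int) error {
	// store is the pgvector-backed vector store initialized earlier.
	docs, err := store.SimilaritySearch(ctx, query, maxResults)
	if err != nil {
		return err
	}
	for _, doc := range docs {
		fmt.Println("similarity search info -", doc.PageContent)
		fmt.Println("similarity search score -", doc.Score)
		fmt.Println("============================")
	}
	return nil
}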
Intelligent Search With RAG To execute a RAG-based search, we run the same command as above (almost), only with a slight change in the action (rag_search): export PG_HOST=localhost export PG_USER=postgres export PG_PASSWORD=postgres export PG_DB=postgres go run *.go -action=rag_search -query="what tools can I use to design dynamodb data models?" -maxResults=3 Here is the output I got (might be slightly different in your case): Based on the context provided, the NoSQL Workbench for DynamoDB is a tool that can be used to design DynamoDB data models. Some key points about NoSQL Workbench for DynamoDB: - It is a cross-platform GUI application available for Windows, macOS, and Linux. - It provides data modeling capabilities to help design and create DynamoDB tables. - It allows you to build new data models or design models based on existing data models. - It provides features like data visualization, sample data generation, and query development to manage DynamoDB tables. - It helps in understanding the key differences and design approaches when moving from a relational database to a NoSQL database like DynamoDB. So in summary, NoSQL Workbench for DynamoDB seems to be a useful tool specifically designed for modeling and working with DynamoDB data models. As you can see, the result is not just about "Here are the top X responses for your query." Instead, it's a well-formulated response to the question. Let's peek behind the scenes again to see how it works. Unlike, ingestion and semantic search, RAG-based search is not directly exposed by the langchaingo vector store implementation. For this, we use a langchaingo chain which takes care of the following: Invokes semantic search Combines the semantic search with a prompt Sends it to a Large Language Model (LLM), which in this case happens to be Claude on Amazon Bedrock. Here is what the chain looks like (refer to the function ragSearch in query.go): result, err := chains.Run( context.Background(), chains.NewRetrievalQAFromLLM( llm, vectorstores.ToRetriever(store, numOfResults), ), question, chains.WithMaxTokens(2048), ) Let’s Try Another One This was just one example. I tried a different question and increased maxResults to 10, which means that the top 10 results from the vector database will be used to formulate the answer. go run *.go -action=rag_search -query="how is NoSQL different from SQL?" -maxResults=10 The result (again, it might be different for you): Based on the provided context, there are a few key differences between NoSQL databases like DynamoDB and relational database management systems (RDBMS): 1. Data Modeling: - In RDBMS, data modeling is focused on flexibility and normalization without worrying much about performance implications. Query optimization doesn't significantly affect schema design. - In NoSQL, data modeling is driven by the specific queries and access patterns required by the application. The data schema is designed to optimize the most common and important queries for speed and scalability. 2. Data Organization: - RDBMS organizes data into tables with rows and columns, allowing flexible querying. - NoSQL databases like DynamoDB use key-value pairs or document storage, where data is organized in a way that matches the queried data shape, improving query performance. 3. Query Patterns: - In RDBMS, data can be queried flexibly, but queries can be relatively expensive and don't scale well for high-traffic situations. 
- In NoSQL, data can be queried efficiently in a limited number of ways defined by the data model, while other queries may be expensive and slow. 4. Data Distribution: - NoSQL databases like DynamoDB distribute data across partitions to scale horizontally, and the data keys are designed to evenly distribute the traffic across partitions, avoiding hot spots. - The concept of "locality of reference," keeping related data together, is crucial for improving performance and reducing costs in NoSQL databases. In summary, NoSQL databases prioritize specific query patterns and scalability over flexible querying, and the data modeling is tailored to these requirements, in contrast with RDBMS where data modeling focuses on normalization and flexibility. Where to “Go” From Here? Learning by doing is a good approach. If you've followed along and executed the application thus far, great! I recommend you try out the following: langchaingo has support for lots of different models, including ones in Amazon Bedrock (e.g. Meta LLama 2, Cohere, etc.) — try tweaking the model and see if it makes a difference. Is the output better? What about the Vector Database? I demonstrated PostgreSQL, but langchaingo supports others as well (including OpenSearch, Chroma, etc.) - Try swapping out the Vector store and see how/if the search results differ. You probably get the gist, but you can also try out different embedding models. We used Amazon Titan, but langchaingo also supports many others, including Cohere embed models in Amazon Bedrock. Wrap Up This was a simple example for you to better understand the individual steps in building RAG-based solutions. These might change a bit depending on the implementation, but the high-level ideas remain the same. I used langchaingo as the framework. But this doesn't always mean you have to use one. You could also remove the abstractions and call the LLM platforms APIs directly if you need granular control in your applications or the framework does not meet your requirements. Like most generative AI, this area is rapidly evolving, and I am optimistic about having Go developers have more options to build generative AI solutions. If you've feedback or questions, or you would like me to cover something else around this topic, feel free to comment below! Happy building!
Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices. A recent conversation with a fellow staff engineer at a Top 20 technology company revealed that their underlying infrastructure is self-managed and does not leverage cloud-native infrastructure offered by major providers like Amazon, Google, or Microsoft. Hearing this information took me a minute to comprehend given how this conflicts with my core focus on leveraging frameworks, products, and services for everything that doesn't impact intellectual property value. While I understand the pride of a Top 20 technology company not wanting to contribute to the success of another leading technology company, I began to wonder just how successful they could be if they utilized a cloud-native approach. That also made me wonder how many other companies have yet to adopt a cloud-native approach… and the impact it is having on their APIs. Why Cloud? Why Now? For the last 10 years, I have been focused on delivering cloud-native API services for my projects. While cloud adoption continues to gain momentum, a decent percentage of corporations and technology providers still utilize traditional on-premises designs. According to The Cloud in 2021: Adoption Continues report by O'Reilly Media, Figure 1 provides a summary of the state of cloud adoption in December 2021. Figure 1. Cloud technology usage Image adapted from The Cloud in 2021: Adoption Continues, O'Reilly Media Since the total percentages noted in Figure 1 exceed 100%, the underlying assumption is that it is common for respondents to maintain both a cloud and on-premises design. However, for those who are late to enter the cloud native game, I wanted to touch on some common benefits that are recognized with cloud adoption: Focus on delivering or enhancing laser-focused APIs — stop worrying about and managing on-premises infrastructure. Scale your APIs up (and down) as needed to match demand — this is a primary use case for cloud adoption. Reduce risk by expanding your API presence — leverage availability zones, regions, and countries. Describe the supporting API infrastructure as code (IaC) — faster recovery and expandability into new target locations. Making the transition toward cloud native has become easier than ever, with the major providers offering free or discounted trial periods. Additionally, smaller platform-as-a-service (PaaS) providers like Heroku and Render provide solutions that allow teams to focus on their products and services and not worry about the underlying infrastructure design. The Cloud Native Impact on Your API Since this Trend Report is focused on modern API management, I wanted to focus on a few of the benefits that cloud native can have on APIs. Availability and Latency Objectives When providing APIs for your consumers to consume, the concept of service-level agreements (SLAs) is a common onboarding discussion topic. This is basically where expectations are put into easy-to-understand wording that becomes a binding contract between the API provider and the consumer. Failure to meet these expectations can result in fees and, in some cases, legal action. API service providers often take things a step further by establishing service-level objectives (SLOs) that are even more stringent. The goal here is to establish monitors and alerts to remediate issues before they breach contractual SLAs. 
But what happens when the SLOs and SLAs struggle to be met? This is where the primary cloud native use case can assist. If the increase in latency is due to hardware limitations, the service can be scaled up vertically (by increasing the hardware) or horizontally (by adding more instances). If the increase in latency is driven by geographical location, introducing service instances in closer regions is something cloud native providers can provide to remedy this scenario. API Management As your API infrastructure expands, a cloud-native design provides the necessary tooling to ease supportability and manageability efforts. From an infrastructure perspective, the underlying definition of the service is defined using an IaC approach, allowing the service itself to become defined in a single location. As updates are made to that base design, those changes can be rolled out to each target service instance, avoiding any drift between service instances. From an API management perspective, cloud native providers include the necessary tooling to manage the APIs from a usage perspective. Here, API keys can be established, which offer the ability to impose thresholds on requests that can be made or features that align with service subscription levels. Cloud Native !== Utopia While APIs flourish in cloud native implementations, it is important to recognize that a cloud-native approach is not without its own set of challenges. Cloud Cost Management CloudZero's The State Of Cloud Cost Intelligence 2022 report concluded that only 40% of respondents indicated that their cloud costs were at an expected level as noted in Figure 2. Figure 2. Cloud native cost realities Image adapted from The State Of Cloud Cost Intelligence, CloudZero This means that 60% of respondents are dealing with higher-than-expected cloud costs, which ultimately impact an organization's ability to meet planned objectives. Cloud native spending can often be remediated by adopting the following strategies: Require team-based tags or cloud accounts to help understand levels of spending at a finer grain. Focus on storage buckets and database backups to understand if the cost is in line with the value. Engage a cloud business partner that specializes in cloud spending analysis. Account Takeover The concept of accounts becoming "hacked" is prevalent in social media. At times, I feel like my social media feed contains more "my account was hacked" messages than the casual updates I was tuning in to read. Believe it or not, the concept of account takeover is becoming a common fear for cloud native adopters. Imagine starting your day only to realize you no longer have access to any of your cloud-native services. Soon thereafter, your customers begin to flood your support lines to ask what is going on… and where the data they were expecting to see with each API call is. Another potential consequence is that the APIs are shut down completely, forcing customers to seek out competing APIs. Remember, your account protection is only as strong as your weakest link. Make sure to employ everything possible to protect your account and move away from simple username + password account protection. Disaster Recovery It is also important to recognize that cloud native is not a replacement for maintaining a strong disaster recovery posture. Understand the impact of availability zone and region-wide outages — both are expected to happen. Plan to implement immutable backups — avoid relying on traditional backups and snapshots. 
Leverage IaC to establish all aspects of cloud native — and test it often. Alternative Flows Exist While a cloud-native approach provides an excellent landscape to help your business and partnerships be successful, there are likely use cases that present themselves as alternative flows for cloud native adoption: Regulatory requirements for a given service can often present themselves as an alternative flow and not be a candidate for cloud native adoption. Point of presence requirements can also become a blocker for cloud native adoption when the closest cloud-native location is not close enough to meet the established SLAs and SLOs. On the Other Side of API Cloud Adoption By adopting a cloud-native approach, it is possible to extend an API across multiple availability zones and geographical regions within a given point of presence. Figure 3. Multi-region cloud native adoption In Figure 3, the API service is deployed across three different geographical regions. Additionally, each region contains an API service instance running in three different availability zones — each with its own network and power source. In this example, there are nine distinct instances running across the United States. By introducing a global common name, consumers always receive a service response from the least-latent and available service instance. This approach easily allows for entire regions to be taken offline for disaster recovery validation without any interruptions of service at the consumer level. Conclusion Readers familiar with my work may recall that I have been focused on the following mission statement, which I feel can apply to any IT professional: Focus your time on delivering features/functionality that extend the value of your intellectual property. Leverage frameworks, products, and services for everything else. —John Vester When I think about my conversation with the staff engineer at the Top 20 tech company, I wonder how much more successful his team would be without having to worry about the underlying infrastructure being managed with their on-premises approach. While the other side of cloud native is not without challenges, it does adhere to my mission statement. As a result, projects that I have worked on for the last 10 years have been able to remain focused on meeting the needs of API consumers while staying in line with corporate objectives. From an API perspective, cloud native offers additional ways to adhere to my personal mission statement by describing everything related to the service using IaC and leveraging built-in tooling to manage the APIs across different availability zones and regions. Have a really great day! This is an excerpt from DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices.
As a Linux administrator or even if you are a newbie who just started using Linux, having a good understanding of useful commands in troubleshooting network issues is paramount. We'll explore the top 10 essential Linux commands for diagnosing and resolving common network problems. Each command will be accompanied by real-world examples to illustrate its usage and effectiveness. 1. ping Example: ping google.com Shell test@ubuntu-server ~ % ping google.com -c 5 PING google.com (142.250.189.206): 56 data bytes 64 bytes from 142.250.189.206: icmp_seq=0 ttl=58 time=14.610 ms 64 bytes from 142.250.189.206: icmp_seq=1 ttl=58 time=18.005 ms 64 bytes from 142.250.189.206: icmp_seq=2 ttl=58 time=19.402 ms 64 bytes from 142.250.189.206: icmp_seq=3 ttl=58 time=22.450 ms 64 bytes from 142.250.189.206: icmp_seq=4 ttl=58 time=15.870 ms --- google.com ping statistics --- 5 packets transmitted, 5 packets received, 0.0% packet loss round-trip min/avg/max/stddev = 14.610/18.067/22.450/2.749 ms test@ubuntu-server ~ % Explanation ping uses ICMP protocol, where ICMP stands for internet control message protocol and ICMP is a network layer protocol used by network devices to communicate. ping helps in testing the reachability of the host and it will also help in finding the latency between the source and destination. 2. traceroute Example: traceroute google.com Shell test@ubuntu-server ~ % traceroute google.com traceroute to google.com (142.250.189.238), 64 hops max, 52 byte packets 1 10.0.0.1 (10.0.0.1) 6.482 ms 3.309 ms 3.685 ms 2 96.120.90.197 (96.120.90.197) 13.094 ms 10.617 ms 11.351 ms 3 po-301-1221-rur01.fremont.ca.sfba.comcast.net (68.86.248.153) 12.627 ms 11.240 ms 12.020 ms 4 ae-236-rar01.santaclara.ca.sfba.comcast.net (162.151.87.245) 18.902 ms 44.432 ms 18.269 ms 5 be-299-ar01.santaclara.ca.sfba.comcast.net (68.86.143.93) 14.826 ms 13.161 ms 12.814 ms 6 69.241.75.42 (69.241.75.42) 12.236 ms 12.302 ms 69.241.75.46 (69.241.75.46) 15.215 ms 7 * * * 8 142.251.65.166 (142.251.65.166) 21.878 ms 14.087 ms 209.85.243.112 (209.85.243.112) 14.252 ms 9 nuq04s39-in-f14.1e100.net (142.250.189.238) 13.666 ms 192.178.87.152 (192.178.87.152) 12.657 ms 13.170 ms test@ubuntu-server ~ % Explanation Traceroute shows the route packets take to reach a destination host. It displays the IP addresses of routers along the path and calculates the round-trip time (RTT) for each hop. Traceroute helps identify network congestion or routing issues. 3. netstat Example: netstat -tulpn Shell test@ubuntu-server ~ % netstat -tuln Active LOCAL (UNIX) domain sockets Address Type Recv-Q Send-Q Inode Conn Refs Nextref Addr aaf06ba76e4d0469 stream 0 0 0 aaf06ba76e4d03a1 0 0 /var/run/mDNSResponder aaf06ba76e4d03a1 stream 0 0 0 aaf06ba76e4d0469 0 0 aaf06ba76e4cd4c1 stream 0 0 0 aaf06ba76e4ccdb9 0 0 /var/run/mDNSResponder aaf06ba76e4cace9 stream 0 0 0 aaf06ba76e4c9e11 0 0 /var/run/mDNSResponder aaf06ba76e4d0b71 stream 0 0 0 aaf06ba76e4d0aa9 0 0 /var/run/mDNSResponder test@ubuntu-server ~ % Explanation Netstat displays network connections, routing tables, interface statistics, masquerade connections, and multicast memberships. It's useful for troubleshooting network connectivity, identifying open ports, and monitoring network performance. 4. 
ifconfig/ip Example: ifconfig or ifconfig <interface name> Shell test@ubuntu-server ~ % ifconfig en0 en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500 options=6460<TSO4,TSO6,CHANNEL_IO,PARTIAL_CSUM,ZEROINVERT_CSUM> ether 10:9f:41:ad:91:60 inet 10.0.0.24 netmask 0xffffff00 broadcast 10.0.0.255 inet6 fe80::870:c909:df17:7ed1%en0 prefixlen 64 secured scopeid 0xc inet6 2601:641:300:e710:14ef:e605:4c8d:7e09 prefixlen 64 autoconf secured inet6 2601:641:300:e710:d5ec:a0a0:cdbb:79a7 prefixlen 64 autoconf temporary inet6 2601:641:300:e710::6cfc prefixlen 64 dynamic nd6 options=201<PERFORMNUD,DAD> media: autoselect status: active test@ubuntu-server ~ % Explanation ifconfig and ip commands are used to view and configure network parameters. They provide information about the IP address, subnet mask, MAC address, and network status of each interface. 5. tcpdump Example:tcpdump -i en0 tcp port 80 Shell test@ubuntu-server ~ % tcpdump -i en0 tcp port 80 tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes 0 packets captured 55 packets received by filter 0 packets dropped by kernel test@ubuntu-server ~ % Explanation Tcpdump is a packet analyzer that captures and displays network traffic in real-time. It's invaluable for troubleshooting network issues, analyzing packet contents, and identifying abnormal network behavior. Use tcpdump to inspect packets on specific interfaces or ports. 6. nslookup/dig Example: nslookup google.com or dig Shell test@ubuntu-server ~ % nslookup google.com Server: 2001:558:feed::1 Address: 2001:558:feed::1#53 Non-authoritative answer: Name: google.com Address: 172.217.12.110 test@ubuntu-server ~ % test@ubuntu-server ~ % dig google.com ; <<>> DiG 9.10.6 <<>> google.com ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46600 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 512 ;; QUESTION SECTION: ;google.com. IN A ;; ANSWER SECTION: google.com. 164 IN A 142.250.189.206 ;; Query time: 20 msec ;; SERVER: 2001:558:feed::1#53(2001:558:feed::1) ;; WHEN: Mon Apr 15 22:55:35 PDT 2024 ;; MSG SIZE rcvd: 55 test@ubuntu-server ~ % Explanation nslookup and dig are DNS lookup tools used to query DNS servers for domain name resolution. They provide information about the IP address associated with a domain name and help diagnose DNS-related problems such as incorrect DNS configuration or server unavailability. 7. iptables/firewalld Example: iptables -L or firewall-cmd --list-all Shell test@ubuntu-server ~# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy DROP) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination test@ubuntu-server ~# Explanation iptables and firewalld are firewall management tools used to configure packet filtering and network address translation (NAT) rules. They control incoming and outgoing traffic and protect the system from unauthorized access. Use them to diagnose firewall-related issues and ensure proper traffic flow. 8. ss Example: ss -tulpn Shell test@ubuntu-server ~# Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port udp UNCONN 0 0 *:161 *:* udp UNCONN 0 0 *:161 *:* test@ubuntu-server ~# Explanation ss is a utility to investigate sockets. 
It displays information about TCP, UDP, and UNIX domain sockets, including listening and established connections, connection state, and process IDs. ss is useful for troubleshooting socket-related problems and monitoring network activity. 9. arp Example: arp -a Shell test@ubuntu-server ~ % arp -a ? (10.0.0.1) at 80:da:c2:95:aa:f7 on en0 ifscope [ethernet] ? (10.0.0.57) at 1c:4d:66:bb:49:a on en0 ifscope [ethernet] ? (10.0.0.83) at 3a:4a:df:fe:66:58 on en0 ifscope [ethernet] ? (10.0.0.117) at 70:2a:d5:5a:cc:14 on en0 ifscope [ethernet] ? (10.0.0.127) at fe:e2:1c:4d:b3:f7 on en0 ifscope [ethernet] ? (10.0.0.132) at bc:d0:74:9a:51:85 on en0 ifscope [ethernet] ? (10.0.0.255) at ff:ff:ff:ff:ff:ff on en0 ifscope [ethernet] mdns.mcast.net (224.0.0.251) at 1:0:5e:0:0:fb on en0 ifscope permanent [ethernet] ? (239.255.255.250) at 1:0:5e:7f:ff:fa on en0 ifscope permanent [ethernet] test@ubuntu-server ~ % Explanation arp (Address Resolution Protocol) displays and modifies the IP-to-MAC address translation tables used by the kernel. It resolves IP addresses to MAC addresses and vice versa. arp is helpful for troubleshooting issues related to network device discovery and address resolution. 10. mtr Example: mtr Shell test.ubuntu.com (0.0.0.0) Tue Apr 16 14:46:40 2024 Keys: Help Display mode Restart statistics Order of fields quit Packets Ping Host Loss% Snt Last Avg Best Wrst StDev 1. 10.0.0.10 0.0% 143 0.8 9.4 0.7 58.6 15.2 2. 10.0.2.10 0.0% 143 0.8 9.4 0.7 58.6 15.2 3. 192.168.0.233 0.0% 143 0.8 9.4 0.7 58.6 15.2 4. 142.251.225.178 0.0% 143 0.8 9.4 0.7 58.6 15.2 5. 142.251.225.177 0.0% 143 0.8 9.4 0.7 58.6 15.2 Explanation mtr (My traceroute) combines the functionality of ping and traceroute into a single diagnostic tool. It continuously probes network paths between the host and a destination, displaying detailed statistics about packet loss, latency, and route changes. Mtr is ideal for diagnosing intermittent network problems and monitoring network performance over time. Mastering these commands comes in handy for troubleshooting network issues on Linux hosts.
In modern web applications, integrating with external services is a common requirement. However, when interacting with these services, it's crucial to handle scenarios where responses might be delayed or fail to arrive. Spring Boot, with its extensive ecosystem, offers robust solutions to address such challenges. In this article, we'll explore how to implement timeouts using three popular approaches: RestClient, RestTemplate, and WebClient, all essential components in Spring Boot. 1. Timeout With RestTemplate First, let's demonstrate setting a timeout using RestTemplate, a synchronous HTTP client. Java import org.springframework.web.client.RestTemplate; public class RestTemplateExample { public static void main(String[] args) { var restTemplate = new RestTemplate(); var url = "https://api.example.com/data"; var response = restTemplate.getForEntity(url, String.class); System.out.println(response.getBody()); } } In this snippet, we're performing a GET request to `https://api.example.com/data`. However, we haven't set any timeout, which means the request might hang indefinitely in case of network issues or server unavailability. To set a timeout, we need to configure RestTemplate with an appropriate `ClientHttpRequestFactory`, such as `HttpComponentsClientHttpRequestFactory`. Java import org.springframework.web.client.RestTemplate; import org.springframework.http.client.HttpComponentsClientHttpRequestFactory; public class RestTemplateTimeoutExample { public static void main(String[] args) { var url = "https://api.example.com/data"; var timeout = 5000; // Timeout in milliseconds var clientHttpRequestFactory = new HttpComponentsClientHttpRequestFactory(); clientHttpRequestFactory.setConnectTimeout(timeout); clientHttpRequestFactory.setConnectionRequestTimeout(timeout); var restTemplate = new RestTemplate(clientHttpRequestFactory); var response = restTemplate.getForEntity(url, String.class); System.out.println(response.getBody()); } } 2. Timeout With WebClient WebClient is a non-blocking, reactive HTTP client introduced in Spring WebFlux. Let's see how we can use it with a timeout: Java import org.springframework.web.reactive.function.client.WebClient; import java.time.Duration; public class WebClientTimeoutExample { public static void main(String[] args) { var client = WebClient.builder() .baseUrl("https://api.example.com") .build(); client.get() .uri("/data") .retrieve() .bodyToMono(String.class) .timeout(Duration.ofMillis(5000)) .subscribe(System.out::println); } } Here, we're using WebClient to make a GET request to the `/data` endpoint. The `timeout` operator specifies a maximum duration for the request to wait for a response. 3. Timeout With RestClient RestClient is a synchronous HTTP client that offers a modern, fluent API and has been available since Spring Boot 3.2. New Spring Boot applications should replace RestTemplate code with the RestClient API.
Now, let's implement a RestClient with timeout using `HttpComponentsClientHttpRequestFactory`: Java import org.springframework.http.client.HttpComponentsClientHttpRequestFactory; import org.springframework.web.client.RestClient; public class RestClientTimeoutExample { public static void main(String[] args) { var factory = new HttpComponentsClientHttpRequestFactory(); factory.setConnectTimeout(5000); factory.setReadTimeout(5000); var restClient = RestClient .builder() .requestFactory(factory) .build(); var response = restClient .get() .uri("https://api.example.com/data") .retrieve() .toEntity(String.class); System.out.println(response.getBody()); } } In this code, we configure connect and read timeouts using HttpComponentsClientHttpRequestFactory and pass the factory to RestClient.builder(). By setting timeouts appropriately, we ensure that our application remains responsive even in scenarios where external services are slow or unresponsive. This proactive approach enhances the overall reliability and resilience of our Spring Boot applications. Conclusion In summary, handling timeouts is important for web apps to stay responsive and robust during interactions with external services. We explored three popular Spring Boot approaches for implementing timeouts effectively: RestTemplate, WebClient, and RestClient. By setting appropriate timeouts, developers can ensure applications gracefully handle delayed or failed responses, enhancing overall reliability and user experience across varying network conditions and service availability.
In many large organizations, software quality is primarily viewed as the responsibility of the testing team. When bugs slip through to production, or products fail to meet customer expectations, testers are the ones blamed. However, taking a closer look, quality — and likewise, failure — extends well beyond any one discipline. Quality is a responsibility shared across an organization. When quality issues arise, the root cause is rarely something testing alone could have prevented. Typically, there were breakdowns in communication, unrealistic deadlines, inadequate design specifications, insufficient training, or corporate governance policies that incentivized rushing. In other words, quality failures tend to stem from broader organizational and leadership failures. Scapegoating testers for systemic issues is counterproductive. It obscures the real problems and stands in the way of meaningful solutions to quality failings. Testing in Isolation In practice, all too often, testing teams still work in isolation from the rest of the product development lifecycle. They are brought in at the end, given limited information, and asked to validate someone else’s work. Under these conditions, their ability to prevent defects is severely constrained. For example, without access to product requirement documents, test cases may overlook critical functions that need validation. With short testing timelines, extensive test coverage further becomes impossible. Without insight into design decisions or access to developers, some defects found in testing prove impossible to diagnose effectively. Testers are often parachuted in when the time and cost of repairing a defect has grown to be unfeasible. In this isolated model, testing serves as little more than a final safety check before release. The burden of quality is passed almost entirely to the testers. When the inevitable bugs still slip through, testers then make for easy scapegoats. Who Owns Software Quality? In truth, responsibility for product quality is distributed across an organization. So, what can you do? Quality is everyone’s responsibility. Image sources: Kharnagy (Wikipedia), under CC BY-SA 4.0 license, combined with an image from Pixabay. Executives and leadership teams — Set the tone and policies around quality, balancing it appropriately against other priorities like cost and schedule. Meanwhile, provide the staffing, resources, and timescale needed for a mature testing effort. Product Managers — Gather user requirements, define expected functionality, and support test planning. Developers — Follow secure coding practices, perform unit testing, enable automated testing, and respond to defects uncovered in testing. User experience designers — Consider quality and testability during UX design. Conduct user acceptance testing on prototypes. Information security — Perform security reviews of code, architectures, and configurations. Guide testing-relevant security use cases. Testers — Develop test cases based on user stories, execute testing, log defects, perform regression test fixes, and report on quality to stakeholders. Operations — Monitor systems once deployed, gather production issues, and report data to inform future testing. Customers — Voice your true quality expectations, participate in UAT, and report real-world issues once launched. As this illustrates, no one functional area owns quality alone. Testers contribute essential verification, but quality is truly everyone’s responsibility. 
Governance Breakdowns Lead to Quality Failures In a 2023 episode of the "Why Didn’t You Test That?" podcast, Marcus Merrell, Huw Price, and I discussed how testing remains treated as a “janitorial” effort and cost center, and how you can align testing and quality. When organizations fail to acknowledge the shared ownership of software quality, governance issues arise that enable quality failures: Unrealistic deadlines — Attempting to achieve overly aggressive schedules often comes at the expense of quality and sufficient testing timelines. Leadership teams must balance market demands against release readiness. Insufficient investment — Success requires appropriate staffing and support for all areas that influence quality. These range from design to development to testing. Underinvestment leads to unhealthy tradeoffs. Lack of collaboration — Cross-functional coordination produces better quality than work done in silos. Governance policies should foster collaboration across product teams, not hinder it. Misaligned priorities — Leadership should incentivize balanced delivery, not just speed or cost savings. Quality cannot be someone else’s problem. Lack of transparency — Progress reporting should incorporate real metrics on quality. Burying or obscuring defects undermines governance. Absence of risk management — Identifying and mitigating quality risks through appropriate action requires focus from project leadership. Lacking transparency about risk prevents proper governance. When these governance breakdowns occur, quality suffers, and failures follow. However, the root causes trace back to organizational leadership and culture, not solely the testing function. The Costs of Obscuring Systemic Issues Blaming testers for failures caused by systemic organizational issues leads to significant costs: Loss of trust — When testers become scapegoats, it erodes credibility and trust in the testing function, inhibiting their ability to advocate for product quality. Staff turnover — Testing teams experience higher turnover when the broader organization fails to recognize their contributions and value. Less collaboration — Other groups avoid collaborating with testers perceived as bottlenecks or impediments rather than partners. Reinventing the wheel — Lessons from past governance breakdowns go unlearned, leading those issues to resurface in new forms down the line. Poorer customer experiences — Ultimately, obscuring governance issues around quality leads to more negative customer experiences that damage an organization’s reputation and bottom line. Taking Ownership of Software Quality Elevating quality as an organization-wide responsibility is essential for governance, transparency, and risk management. Quality cannot be the burden of one isolated function, and leadership should foster a culture that values quality intrinsically, rather than viewing it as an afterthought or checkbox. To build ownership, organizations need to shift testing upstream, integrating it earlier into requirements planning, design reviews, and development processes. It also requires modernizing the testing practice itself, utilizing the full range of innovation available: from test automation, shift-left testing, and service virtualization, to risk-based test case generation, modeling, and generative AI. With a shared understanding of who owns quality, governance policies can better balance competing demands around cost, schedule, capabilities, and release readiness.
Testing insights will inform smarter tradeoffs, avoiding quality failures and the finger-pointing that today follows them. This future state reduces the likelihood of failures — but also acknowledges that some failures will still occur despite best efforts. In these cases, organizations must have a governance model to transparently identify root causes across teams, learn from them, and prevent recurrence. In a culture that values quality intrinsically, software testers earn their place as trusted advisors, rather than get relegated to fault-finders. They can provide oversight and validation of other teams’ work without fear of backlash. And their expertise will strengthen rather than threaten collaborative delivery. With shared ownership, quality ceases to be a “tester problem” at all. It becomes an organizational value that earns buy-in across functional areas. Leadership sets the tone for an understanding that if quality is everyone’s responsibility — so too is failure.
The objective behind the solution described in this blog was to be able to share live 3D objects captured by one person, using “normal-looking” glasses, with another person who can then view them in XR (AR, VR, or MR) and/or 3D print them and do so with a similar experience as to what exists today for 2D pictures and 2D printers. About This Project While Meta Ray-Ban glasses are not XR headsets (they are smart glasses), they are currently the most unobtrusive, aesthetically “normal” looking glasses on the market that can be used to capture video (which can then be turned into 3D objects via an intermediary service) as they are indistinguishable from regular Ray-Ban Headliner and Wayfarer glasses aside from the small camera lenses that are mere millimeters in size. See the side-by-side comparison below. Ray-Ban Wayfarer glasses Meta Ray-Ban Wayfarer glasses The Oracle database plays a central role in the solution as it provides an optimized location for all types of data storage (including 3D objects and point clouds), various inbound and outbound API calls, and AI and spatial operations. Details can be found here. I will start by saying that this process will of course be more streamlined in the future as better hardware and software tech and APIs become available; however, the need for workflow logic, interaction with and exposure of APIs, and central storage will remain a consistent requirement of an optimal architecture for the functionality. It is possible to run both Java and JavaScript from within the Oracle database, load and use libraries for those languages, expose these programs as API endpoints, and make calls out from them. It is also possible to simply issue direct HTTP, REST, etc., commands from PL/SQL using the UTL_HTTP.BEGIN_REQUEST call or, for Oracle AI cloud services (or any Oracle cloud services), by using the DBMS_CLOUD.send_request call. This offers a powerful and flexible architecture where the following four combinations are possible. This being the case, there are several ways to go about the solution described here; for example, by issuing requests directly from the database or an intermediary external application (such as microservices deployed in a Kubernetes cluster) as shown in the previous diagrams. Flow The flow is as follows: The user takes a video with Ray-Bans. The video is automatically sent to Instagram (or Cloud Storage). The Oracle database calls Instagram to get the video and saves it in the database (or object storage, etc.). The Oracle database sends video to the photogrammetry/AI service and retrieves the 3D model/capture generated by it. Optionally, further spatial and AI operations are automatically conducted on the 3D model by the Oracle database. Optionally, further manual modifications are made to the 3D model and a manual workflow step may be added (for example, to gate 3D printing). From here the 3D capture/model can be 3D printed or viewed and interacted with via an XR (VR, AR, MR) headset - or both can be done in parallel. 3D Printing The Oracle database sends the 3D model (.obj file) to PrusaSlicer, which generates and returns G-code from it. The G-code print job is then sent to 3D printer via OctoPrint API server. XR Viewing and Interaction The 3D model is exposed as REST (ORDS) endpoint. The XR headset (Magic Leap 2, Vision Pro, Quest, etc.) receives the 3D model from Oracle database and renders it for viewing and interaction at runtime. In diagram form, the flow looks roughly like this: Now let’s elaborate on each step. 
Step 1: The User Takes a Video With Ray-Bans As mentioned earlier, I did not pick Meta Ray-Bans due to their XR functionality. Numerous other glasses have actual XR functionality well beyond Ray-Bans, full-on XR headsets with an increasingly better ability to do 3D scans of various types, and 3D scanners devoted to extremely accurate, high-resolution scans. I picked Ray-Bans because they are the glasses that are, in short, the most "normal" looking (without thick lenses or bridges that sit far from the face or extra extensions, etc.). Meta Ray-Bans have a "hey Meta" command that works like Alexa or Siri, though fairly limited at this point, as it cannot refer to location services, can send but not read messages, etc. It's not hard to see how it is possible to use Vision AI, etc. with them. However, built-in functionality does not exist currently, and more importantly, there is no API to access any functionality (there are access hacks out there but this blog will stick to legit, supported approaches), so it is limited for developers at this point. It can play music and, most importantly, take pictures and videos — that is the functionality I am using here. Streaming must be set up on the Meta View app and the Instagram account being streamed to must be a business account. However, both of these are simple configuration steps that can be found in the documentation and do not require additional cost. Step 2: Video Is Automatically Sent to Instagram (Or Cloud Storage) Ray-Ban video recording is limited to one-minute clips, but that is enough for any modern photogrammetry/AI engine to generate a 3D model of small to medium objects. Video taken with the glasses can be stored in cloud services such as iCloud and Google. However, it is not automatically synced until the glasses are placed in the glasses case. This is why I opted for storage in Instagram reels, which, not surprisingly, is supported by the Meta Ray-Bans such that videos can be automatically streamed and saved there as they are taken. Setup steps to stream can be found here. Step 3: Oracle Database Calls Instagram (App) to Get the Video Here, the Oracle database itself listens/polls for new Instagram reels/videos, using the Instagram Graph API. This requires creating a Meta/Instagram application, etc., and here are the steps involved in doing so. Register as a Meta developer. Create an app and submit it for approval. This takes approximately 5 days if everything has been completed correctly for eligibility. This process, in particular for the Instagram Basic Display app type we are creating, is described on this Getting Started page. However, I will provide a few additional screenshots here to elaborate a bit as certain new items around app types, privileges, and app approval processes are missing from the doc. First, it is necessary to select a Consumer app type and then the Instagram Basic Display product. Then the app is submitted for approval for the instagram_graph_user_profile and instagram_graph_user_media permissions. Finally, testers are added/invited for access tokens to be generated.
Once the application is set up and access token(s) acquired, a list of the information about the media from the account is obtained by issuing a request in the following format: https://graph.instagram.com/{{IG_USER_ID}}/media?fields=id,caption,media_url,media_type,timestamp,children{media_url}&limit=10&access_token={{IG_LONG_LIVED_ACCESS_TOKEN}} Finally, the media desired is filtered out from the JSON returned (i.e., any new videos posted), and the media_url can be used to get the actual media. As stated before, the video can be retrieved from the URL using PL/SQL, JavaScript, or Java from within the Oracle database itself, or via an intermediary service called from the database. It can then be saved in the database, object storage, or other storage and sent to the photogrammetry/AI service. An example of doing this with JavaScript from inside the database can be found in my blog How To Call Cohere and Hugging Face AI From Within an Oracle Database Using JavaScript, and an example using PL/SQL is presented here:

PLSQL
CREATE TABLE file_storage (
    id           NUMBER PRIMARY KEY,
    filename     VARCHAR2(255),
    file_content BLOB
);

DECLARE
    l_http_request  UTL_HTTP.req;
    l_http_response UTL_HTTP.resp;
    l_blob          BLOB;
    l_buffer        RAW(32767);
    l_amount        BINARY_INTEGER := 32767;
    l_pos           INTEGER := 1;
    l_url           VARCHAR2(1000) := 'https://somesite.com/somefile.obj';
BEGIN
    INSERT INTO file_storage (id, filename, file_content)
    VALUES (1, 'file.obj', EMPTY_BLOB())
    RETURNING file_content INTO l_blob;

    -- Open HTTP request to download file
    l_http_request := UTL_HTTP.begin_request(l_url);
    UTL_HTTP.set_header(l_http_request, 'User-Agent', 'Mozilla/4.0');
    l_http_response := UTL_HTTP.get_response(l_http_request);

    -- Download the file and write it to the BLOB
    LOOP
        BEGIN
            UTL_HTTP.read_raw(l_http_response, l_buffer, l_amount);
            -- Append only the bytes actually read (the last chunk may be shorter than l_amount)
            DBMS_LOB.writeappend(l_blob, UTL_RAW.length(l_buffer), l_buffer);
            l_pos := l_pos + UTL_RAW.length(l_buffer);
        EXCEPTION
            WHEN UTL_HTTP.end_of_body THEN
                EXIT;
        END;
    END LOOP;
    UTL_HTTP.end_response(l_http_response);
    COMMIT;
    DBMS_OUTPUT.put_line('File downloaded and saved.');
EXCEPTION
    WHEN UTL_HTTP.end_of_body THEN
        UTL_HTTP.end_response(l_http_response);
    WHEN OTHERS THEN
        UTL_HTTP.end_response(l_http_response);
        RAISE;
END;

This same technique can be used for any call out from the database and storing of any file/content. Therefore, these snippets can be referred back to for saving any file, including the .obj file(s) generated in the next step. Step 4: Oracle Database Sends Video to the Photogrammetry/AI Service and Retrieves the 3D Model/Capture Generated by It There are a few photogrammetry/AI (and NeRF, splat, etc.) services available. I have chosen to use Luma Labs again because it has an API available for direct HTTPS calls, and examples are also given for over 20 programming languages and platforms. The reference for it can be found here. I will keep things short by giving the succinct curl command for each call in the flow, but the same can be done using PL/SQL, JavaScript, etc. from the database as described earlier. Once a Luma Labs account is created and a luma-api-key created, the process of converting the video to a 3D .obj file is as follows: Create/initiate a capture.

Shell
curl --location 'https://webapp.engineeringlumalabs.com/api/v2/capture' \
  --header 'Authorization: luma-api-key={key}' \
  --data-urlencode 'title=hand'

# example response
# {
#   "signedUrls": {
#     "source": "https://storage.googleapis.com/..."
#   },
#   "capture": {
#     "title": "hand",
#     "type": "reconstruction",
#     "location": null,
#     "privacy": "private",
#     "date": "2024-03-26T15:54:08.268Z",
#     "username": "paulparkinson",
#     "status": "uploading",
#     "slug": "pods-of-kon-66"
#   }
# }
This call will return a signedUrls.source URL that is then used to upload the video. Also note the generated slug value returned, which will be used to trigger 3D processing, check processing status, etc.

Shell
curl --location --request PUT 'https://storage.googleapis.com/...' \
  --header 'Content-Type: text/plain' \
  --data-binary '@hand.mov'

Once the video file is uploaded, the processing is triggered by issuing a POST request to the slug retrieved in step 1.

Shell
curl --location -g --request POST 'https://webapp.engineeringlumalabs.com/api/v2/capture/{slug}' \
  --header 'Authorization: luma-api-key={key}'

If the process is triggered successfully, a value of true will be returned, and the following can be issued to check the status of the capture by calling the capture endpoint.

Shell
curl --location -g 'https://webapp.engineeringlumalabs.com/api/v2/capture/{slug}' \
  --header 'Authorization: luma-api-key={key}'

Once the status returned is equal to complete, the 3D capture zip file (which contains the .obj file as well as the .mtl material mapping file and .png texture files) is downloaded and saved by calling the download endpoint. The approaches mentioned earlier can be used to do this and save the file(s).
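If this orchestration runs outside the database (for example, in the kind of intermediary microservice mentioned earlier), the polling and download steps could look roughly like the following Java sketch using the JDK's built-in HTTP client. This is a simplified illustration based on the calls shown above: the JSON handling is deliberately crude, and the method arguments (slug, key, download URL) are assumptions rather than a definitive client for the service.

Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class LumaCapturePoller {

    private static final String CAPTURE_URL = "https://webapp.engineeringlumalabs.com/api/v2/capture/";
    private static final HttpClient HTTP = HttpClient.newHttpClient();

    // Polls the capture endpoint shown above until the returned status is "complete".
    public static void waitForCompletion(String slug, String apiKey) throws Exception {
        while (true) {
            HttpRequest status = HttpRequest.newBuilder()
                    .uri(URI.create(CAPTURE_URL + slug))
                    .header("Authorization", "luma-api-key=" + apiKey)
                    .GET()
                    .build();
            String body = HTTP.send(status, HttpResponse.BodyHandlers.ofString()).body();

            // Crude check; a real implementation would parse the JSON response properly.
            if (body.contains("\"status\":\"complete\"")) {
                return;
            }
            Thread.sleep(30_000); // processing takes a while, so poll every 30 seconds
        }
    }

    // Streams the zip (.obj, .mtl, .png files) straight to disk; the URL is whatever
    // the download endpoint described above returns for the completed capture.
    public static void download(String downloadUrl, String apiKey, Path target) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(downloadUrl))
                .header("Authorization", "luma-api-key=" + apiKey)
                .GET()
                .build();
        HTTP.send(request, HttpResponse.BodyHandlers.ofFile(target));
    }
}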
Step 5: Optionally, Further Spatial and AI Operations Are Automatically Conducted on the 3D Model by the Oracle Database It is also possible to break down the .obj file and store its various vertices, vertex texture/material mappings, etc. in a table as a point cloud for analysis and manipulation. Here is a simple example of that:

PLSQL
create or replace procedure gen_table_from_obj(id number) as
    ord MDSYS.SDO_ORDINATE_ARRAY;
    f   UTL_FILE.FILE_TYPE;
    s   VARCHAR2(2000);
    i   number;
begin
    ord := MDSYS.SDO_ORDINATE_ARRAY();
    ord.extend(3);
    f := UTL_FILE.FOPEN('ADMIN_DIR', 'OBJFROMPHOTOAI.obj', 'R');
    i := 1;
    while true loop
        UTL_FILE.GET_LINE(f, s);
        if (s = '') then
            exit;
        end if;
        -- "v" lines hold vertex coordinates; occurrences 3, 5, and 7 skip the empty matches between spaces
        if (REGEXP_SUBSTR(s, '[^ ]*', 1, 1) = 'v') then
            ord(1) := TO_NUMBER(REGEXP_SUBSTR(s, '[^ ]*', 1, 3));
            ord(2) := TO_NUMBER(REGEXP_SUBSTR(s, '[^ ]*', 1, 5));
            ord(3) := TO_NUMBER(REGEXP_SUBSTR(s, '[^ ]*', 1, 7));
            insert into INP_OBJFROMPHOTOAI_TABLE(val_d1, val_d2, val_d3)
            values (ord(1), ord(2), ord(3));
        end if;
        i := i + 1;
    end loop;
    UTL_FILE.FCLOSE(f);
exception
    when NO_DATA_FOUND then
        -- end of file reached
        UTL_FILE.FCLOSE(f);
end;
/

The Oracle database has had a spatial component for decades now, and recent versions have added several operations for different analyses of point clouds, mesh creation, .obj export, etc. These are described in this video. One operation that has existed for a number of releases is the pc_simplify function shown below. This is often referred to as "decimate" or other terms by various 3D modeling tools and provides the ability to reduce the number of polygons in a mesh, thus reducing overall size. This is handy for a number of reasons, such as when different clients will use the 3D model: for example, a phone with limited bandwidth or no need for high-poly meshes.

PLSQL
procedure pc_simplify(
    pc_table            varchar2,
    pc_column           varchar2,
    id_column           varchar2,
    id                  varchar2,
    result_table_name   varchar2,
    tol                 number,
    query_geom          mdsys.sdo_geometry default null,
    pc_intensity_column varchar2 default null)
  DETERMINISTIC PARALLEL_ENABLE;

Step 6: Optionally, Further Manual Modifications Can Also Be Made to the 3D Model, and a Manual Approval Can Be Inserted as Part of the Workflow The 3D model can be loaded from the database and edited in 3D modeling tools like Blender or 3D printing tools such as BambuLabs, among numerous others. Due to the many steps involved in this overall process, the solution is also a good fit for a workflow engine such as the one that exists as part of the Oracle database. In this case, a manual review/approval can be inserted as part of the workflow to prevent sending or printing undesired models. From here the 3D capture/model can be 3D printed or viewed and interacted with via an XR (VR, AR, MR) headset - or both can be done in parallel. 3D Printing Step 1: Oracle Database Sends the 3D Model (.Obj File) to PrusaSlicer Which Generates and Returns G-Code From It PrusaSlicer is an extremely robust and successful open-source project/application that takes 3D models (.stl, .obj, .amf) and converts them into G-code instructions for FFF printers or PNG layers for mSLA 3D printers. It supports every conceivable printer and format; however, it does not provide an API, only a CLI. There are a few ways to work/hack around this for automation. One, shown here, is to implement a Spring Boot microservice that takes the .obj file and executes the PrusaSlicer CLI (which must be accessible to the microservice of course), returning the G-code.

Java
@RestController
public class SlicerController {

    @PostMapping(value = "/slice", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    public byte[] sliceStlFile(@RequestParam("file") MultipartFile file,
                               @RequestParam("config") String configPath)
            throws IOException, InterruptedException {
        Path tempDir = Paths.get(System.getProperty("java.io.tmpdir"));
        Path stlFilePath = Files.createTempFile(tempDir, "stl", ".stl");
        file.transferTo(stlFilePath.toFile());
        Path gcodePath = Files.createTempFile(tempDir, "output", ".gcode");

        // Execute PrusaSlicer CLI command
        String command = String.format("PrusaSlicer --slice --load %s --output %s %s",
                configPath, gcodePath.toString(), stlFilePath.toString());
        Process prusaSlicerProcess = Runtime.getRuntime().exec(command);
        prusaSlicerProcess.waitFor();

        // Return G-code
        byte[] gcodeBytes = FileUtils.readFileToByteArray(gcodePath.toFile());
        Files.delete(stlFilePath);
        Files.delete(gcodePath);
        return gcodeBytes;
    }
}

Once the G-code for the .obj file has been returned, it can be passed to the OctoPrint API server for printing.
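As a rough illustration of how that handoff could be automated, the sketch below calls the /slice endpoint defined above from another Java component using Spring's WebClient (any HTTP client would work). The service URL and slicing profile path are placeholder assumptions for this sketch, not part of the project's actual configuration.

Java
import org.springframework.core.io.FileSystemResource;
import org.springframework.http.MediaType;
import org.springframework.http.client.MultipartBodyBuilder;
import org.springframework.web.reactive.function.BodyInserters;
import org.springframework.web.reactive.function.client.WebClient;

import java.nio.file.Path;

public class SlicerClient {

    // Hypothetical URL of the slicer microservice defined above
    private static final String SLICER_URL = "http://slicer-service:8080/slice";

    public byte[] slice(Path objFile) {
        MultipartBodyBuilder body = new MultipartBodyBuilder();
        body.part("file", new FileSystemResource(objFile));   // the .obj produced in the previous step
        body.part("config", "/config/prusa-profile.ini");     // hypothetical slicing profile path on the service host

        // POST the model to the /slice endpoint and receive the generated G-code bytes back
        return WebClient.create()
                .post()
                .uri(SLICER_URL)
                .contentType(MediaType.MULTIPART_FORM_DATA)
                .body(BodyInserters.fromMultipartData(body.build()))
                .retrieve()
                .bodyToMono(byte[].class)
                .block();
    }
}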
Step 2: G-Code Print Job Is Then Sent to 3D Printer via OctoPrint API Server OctoPrint is an application for 3D printers that offers a web interface for printer control. It can be installed on essentially any computer (in the case of my setup, a minimal Raspberry Pi) that is connected to the printer. This can even be done over Wi-Fi, cloud, etc. depending on the printer and setup. However, we will keep it to this basic setup. Again, printers have different applications to provide this functionality, but OctoPrint provides a REST API, which allows for programmatic control, including uploading and printing G-code files. First, an API key must be obtained from OctoPrint's web interface under Settings > API. Then, the G-code file is uploaded and immediately printed by using a call of this format/content:

Shell
curl -k -X POST "http://octopi.local/api/files/local" \
  -H "X-Api-Key: API_KEY" \
  -F "file=@/path/to/file.gcode" \
  -F "select=true" \
  -F "print=true"

XR Viewing and Interaction Step 1: 3D Model Is Exposed as a REST (ORDS) Endpoint Any data stored in the Oracle Database can be exposed as a REST endpoint. To expose the .obj file for download by the XR headset, we can REST-enable the table created earlier using the following:

PLSQL
CREATE OR REPLACE PROCEDURE download_file(p_id IN NUMBER) IS
    l_blob BLOB;
BEGIN
    SELECT file_content
      INTO l_blob
      FROM file_storage
     WHERE id = p_id;

    -- Use ORDS to deliver the BLOB to the client
    ORDS.enable_download(l_blob);
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        HTP.p('File not found.');
    WHEN OTHERS THEN
        HTP.p('Error retrieving file.');
END download_file;

This makes the file (i.e., the .obj, etc. files) accessible via a simple GET call.

Shell
curl -X GET "http://thedbserver/ords/theschema/file/file/{id}" -o "3Dmodelwithobjandtextures.zip"

Here, we are exposing and downloading the file by id. The XR headset keeps track of the 3D models it has received and polls for the next id each time. From here, the 3D model can be viewed on computers, phones, etc. as-is. However, it is obviously more interactive to view in an actual XR headset, which is what I will describe next. Step 2: XR Headset (Magic Leap 2, Vision Pro, Quest, etc.) Receives the 3D Model From the Oracle Database and Renders It For Viewing and Interaction at Runtime The process for receiving and rendering 3D in Unity as I am showing here (and likewise for Unreal) is the same regardless of the headset used. Interaction with 3D objects (via e.g., hand tracking, eye gaze, voice, etc.) has also been standardized with OpenXR and WebXR, which Magic Leap, Meta, and others are compliant with. However, Apple (similar to the case of phone, etc. development) has its own development SDK, ecosystem, etc., and regardless, interaction is not the crux of this blog, so I will only cover the important aspects of the 3D object for viewing. There are a couple of assets on the Unity asset store for doing this conveniently. What is shown below is greatly simplified, but explains the general approach.
First, the 3D model is downloaded using a script like this:

C#
using System.Collections;
using System.IO;
using System.Text;
using UnityEngine;
using UnityEngine.Networking;

public class DownloadAndSave3DModel : MonoBehaviour
{
    private string ordsFileUrl = "http://thedbserver/ords/theschema/file/file/{id}";
    private string filePath;
    private string filePathForTextures;

    void Start()
    {
        filePath = Path.Combine(Application.persistentDataPath, "3Dmodelwithobjandtextures.zip");
        StartCoroutine(DownloadFile(ordsFileUrl));
    }

    IEnumerator DownloadFile(string url)
    {
        using (UnityWebRequest webRequest = UnityWebRequest.Get(url))
        {
            yield return webRequest.SendWebRequest();
            if (webRequest.isNetworkError || webRequest.isHttpError)
            {
                Debug.LogError("Error: " + webRequest.error);
            }
            else
            {
                //Unpack the zip here and process each file for the case where it isObjFile, isMtlFile, or isTextureFile;
                //The texture/png files are written to a file like this
                if (isTextureFile)
                    File.WriteAllBytes(filePathForTextures, webRequest.downloadHandler.data);
                //Whereas the .obj and .mtl files are converted to a stream like this
                else if (isObjFile)
                {
                    var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(webRequest.downloadHandler.text));
                    processObjFile(memoryStream);
                }
                else if (isMtlFile)
                {
                    var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(webRequest.downloadHandler.text));
                    processMtlFile(memoryStream);
                }
            }
        }
    }
}

As shown, the texture/.png files in the zip are saved and the .obj and .mtl files are converted to MemoryStreams for processing/creating the Unity GameObject's MeshRenderer, etc. This parsing is similar to how we created a point cloud table from the .obj in the earlier optional step where we parse the lines of the .obj file; however, it is a bit more complicated as we also parse the .mtl file (the spec explaining these formats can be found here) and apply the textures to create the end resultant 3D model that is rendered for viewing at runtime. This entails the basic logic below, which will create the GameObject that can then be placed in the headset wearer's FOV or as part of a library menu, etc., to interact with.

C#
// Create the Unity GameObject that will be the main result/holder of our 3D object
var gameObject = new GameObject(_name);
// add a MeshRenderer and MeshFilter to it
var meshRenderer = gameObject.AddComponent<MeshRenderer>();
var meshFilter = gameObject.AddComponent<MeshFilter>();

// Create a Unity Vector object for each of the "v"/Vertices, "vn"/Normals, and "vt"/UVs values
// parsed from each of the lines in the .obj file
new Vector3(x, y, z);

// Create a Unity Mesh and add all of the Vertices, Normals, and UVs to it.
var mesh = new Mesh();
mesh.SetVertices(vertices);
mesh.SetNormals(normals);
mesh.SetUVs(0, uvs);

// Similarly parse the .mtl file to create the array of Unity Materials.
var material = new Material(Shader.Find("Standard"));
material.SetTexture(...);
material.SetColor(...);
//collect materials into materialArray

// Add the mesh and/with materials
meshRenderer.sharedMaterials = materialArray;
meshFilter.sharedMesh = mesh;

Conclusion As you can see, there are many steps to the process; however, hopefully, you have found it interesting to see how it is possible — and will only become easier — to share 3D objects the way we share 2D pictures today. Please let me know if you have any thoughts or questions whatsoever and thank you very much for reading! Source Code The source code for the project can be found here. Video
Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices. In the dynamic and rapidly evolving landscape of cloud-native and SaaS-driven software development, the process of API generation plays a pivotal role in accelerating time to market (TTM) and competitive advantage by rapidly creating APIs. This crucial function now converges with low- and no-code platforms, ushering in new ways of streamlined development processes. At the forefront of this evolution stands generative AI (GenAI), which offers unprecedented speed and flexibility in API creation. In this article, we embark on a journey to explore the transformative potential of generative AI and low- and no-code platforms in API generation by highlighting their roles in fostering innovation and expediting time to market. Additionally, we delve into strategies for effective implementation, compliance considerations, enterprise patterns, and the future trajectory of no-code API generation. Refer to Figure 1 for an illustration of the key components and architecture involved in this synergy: Generative AI – These algorithms autonomously generate code snippets by analyzing large datasets. Low-code/no-code platform – This encompasses the visual interface and pre-built components enabling API design without manual coding. API generation components – These leverage GenAI capabilities to automate API code generation based on user-defined specifications and prompts. API output – The final generated API code that is ready for deployment and integration into software applications. Generative vs. Conventional APIs in the Modern Software Landscape The debate between generative and conventional APIs has intensified, particularly concerning their impact on time to market and skill requirements. Generative APIs — powered by advanced technologies like artificial intelligence (AI) on top of low- and no-code platforms — have gained traction for their ability to automate the API generation process, promising faster TTM and reducing the skill threshold required for development. Conversely, conventional APIs — built through manual coding processes — typically demand a higher level of technical expertise and entail longer development cycles. In contrast, conventional APIs offer flexibility, tailored customization, and, hence, better capabilities for handling complexity. Table 1. Generative vs. 
conventional APIs
Aspect | Generative APIs | Conventional APIs
Time to market | Rapid development cycles due to automation | Lengthy development timelines
Skill requirements | Lower skill threshold; accessible to non-coders | Require proficient coding skills
Handling complex logic | May struggle with replicating complex logic accurately | Excel in handling complex logic and intricate business rules
Integration complexity | Simplified integration with existing systems | May be challenging and time consuming
Maintenance effort | Reduced due to automation | Require ongoing manual updates and maintenance
Innovation potential | Enable rapid experimentation and innovation | May be limited due to manual coding and testing processes
Error handling | Automated but may be less customizable | Fully customizable
Legacy system integration | May require additional effort | Better compatibility and integration capabilities
Creative freedom | Limited due to automated processes | Flexibility for developers
Table 1 highlights the trade-offs between the two approaches, emphasizing the potential to revolutionize API development by expediting TTM and democratizing access to software development. Moreover, by lowering the skill threshold required for API development, these tools empower a broader range of individuals to contribute to the creation of APIs while developers can focus on higher-level tasks. The Synergy of Generative AI and Low and No Code for API Generation The integration of GenAI with low and no code for API generation signifies a significant advancement in API development/generation practices, particularly in terms of efficiency and accessibility. GenAI employs sophisticated large language model (LLM) algorithms that learn to autonomously analyze large datasets on enterprise (an organization's internal context) and open source patterns. When combined with low- and no-code platforms, GenAI enhances these tools by providing developers AI-generated code components that can be easily incorporated into their applications. This integration streamlines the development process, allowing developers to leverage pre-built AI-generated modules to accelerate prototyping and customization. Overall, the integration of GenAI with low- and no-code platforms revolutionizes API development, empowering organizations to innovate rapidly and deliver high-quality solutions to market with unprecedented speed and efficiency. Additionally, this amalgamation profoundly impacts the daily operations and responsibilities of engineers and developers, empowering them to innovate, streamline workflows, and adapt to the evolving demands of modern development practices. Figure 1. Architectural overview of GenAI with low- and no-code platforms for automated API generation Charting the Future: The Trajectory of No-Code API Generation As businesses prioritize agility and efficiency, the rising adoption of automated no-code API generation is set to transform development processes, therefore streamlining workflows and expediting TTM. With no-code platforms now able to accurately generate complex APIs by leveraging GenAI, this trajectory foresees a transition to more intuitive and potent development tools, which, in turn, will empower organizations to innovate swiftly and deliver top-notch solutions with unparalleled speed and efficiency. Democratizing Development As organizations adopt these innovative solutions, the democratization of development gains momentum.
No-code API generation platforms empower individuals with diverse technical backgrounds to engage in API creation, promoting collaboration and inclusivity. By lowering entry barriers and reducing reliance on traditional coding skills, these platforms foster a future where API development is accessible and collaborative. This democratization accelerates innovation and ensures that a broader range of voices and perspectives contribute to the development process, resulting in more inclusive and impactful solutions. Compliance and Security In the domain of automated no-code API generation, organizations must prioritize compliance and security to uphold the integrity and reliability of their APIs. Compliance entails adhering to regulatory requirements such as GDPR, HIPAA, and PCI-DSS, as well as industry-specific benchmarks like ISO 27001 or SOC 2. Additionally, robust security measures — including authentication and encryption protocols — are essential for safeguarding against unauthorized access, secret rotations, and cyber threats. Prioritizing compliance and security enables organizations to mitigate risks, protect data, and uphold stakeholder trust. Business Transformation No-code API generation empowers businesses to swiftly adapt to market changes by facilitating rapid API creation and deployment without extensive coding. These tools and platforms enable developers to iterate quickly on ideas, prototypes, and solutions, expediting the development cycle and enabling organizations to respond promptly to market demands. By automating manual coding tasks and streamlining development processes, these tools free up developers' time and resources, allowing them to focus on more strategic tasks such as innovation, problem solving, and optimization. Additionally, these platforms democratize the development process by facilitating collaboration among cross-functional teams with varying levels of technical expertise, fostering scalability and competitiveness. Embracing no-code API generation is essential for driving meaningful transformation and maintaining a competitive edge in today's dynamic digital landscape. Challenges of Automated API Generation While these tools offer streamlined development processes, they come with hurdles that must be addressed. The following chart highlights key technical challenges, providing insights for effective implementation. Table 2. Challenges of automated API generation Challenge Description Handling complex data structures and business logic Handling complex data structures and business logic may pose challenges in integrating diverse data, validating data, and handling versioning effectively. Integration dependencies Automated tools may struggle to integrate seamlessly with existing systems and external APIs due to data format disparities, API versioning conflicts, and limited customization options. Security vulnerabilities Automated processes of low- and no-code results may be prone to potential vulnerabilities such as inadequate access controls, insecure configurations, and absence of data encryption, as well as vulnerabilities within third-party integrations. Limited flexibility Limited flexibility arises from predefined templates or patterns, constraining customization and adaptability to specific project requirements, potentially impacting functionality and scalability. 
Future proofing Low- and no-code API platforms are abstract in nature and can result in vendor lock-in; adaptability to evolving technologies as well as ensuring long-term support and compatibility can be challenging. Strategies and Guidelines for Generative No-Code API Development Strategies and guidelines provide a roadmap for leveraging AI-driven tools and low- and no-code platforms effectively. These strategies and guidelines encompass comprehensive planning, iterative development, and collaborative approaches, thus ensuring streamlined workflows and accelerated TTM while prioritizing automation, scalability, and security. Table 3. Developing generative no-code APIs Key Aspects Strategies and Guidelines Comprehensive planning Plan thoroughly by defining clear objectives and requirements up front to ensure alignment with business goals and user needs. Iterative development Adopt an iterative development approach, allowing for continuous feedback, testing, and refinement throughout the development process. Collaborative development Foster collaboration between technical and non-technical stakeholders, encouraging cross-functional teams to contribute to API design and development. Embrace automation Leverage automation tools and features provided by no- and low-code platforms to streamline development tasks and increase productivity. Ensure scalability Design APIs with scalability in mind, anticipating future growth and ensuring that the architecture can support increased demand and usage over time. Prioritize security Implement robust security measures to protect APIs from potential threats, including data breaches, unauthorized access, and injection attacks. Testing and validation Implement rigorous testing and validation processes to ensure the reliability, functionality, and interoperability of APIs across different platforms and environments. Conclusion As we cast our gaze toward the future shaped by cloud-native and SaaS-driven development, the integration of generative AI with low- and no-code platforms emerges as a catalyst for innovation. This symbiotic relationship not only revolutionizes API generation but also bestows developers with unprecedented flexibility and efficiency. Embracing automation and innovation will be pivotal in meeting the evolving market demands and expediting TTM. This trend represents more than just a leap in technological prowess; it signals a paradigm shift in the ethos of API development, where the context of creativity and efficiency converge harmoniously. Ultimately, developers and engineers are empowered by automated API generation tools, enabling them to rapidly translate ideas into prototypes and solutions, thus expediting the development cycle. This capability positions engineering and development teams to respond promptly to market demands and feature requirements, fostering experimentation and innovation. By automating manual coding tasks and streamlining development processes, these tools unlock opportunities for organizations to gain a competitive edge by delivering value to customers with unparalleled speed. Despite inevitable challenges, such as compliance and security considerations, the trajectory of automated API generation remains on a path of progress. Embracing strategic guidelines and proactively addressing challenges, businesses can harness the transformative potential of automated API generation to shape the future of software development and technology trends. 
This is an excerpt from DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices. Read the Free Report
Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices. REST APIs have become the standard for communication between applications on the web. Based on simple yet powerful principles, REST APIs offer a standardized yet flexible approach to the design, development, and consumption of programming interfaces. By adopting a client-server architecture and making appropriate use of HTTP methods, REST APIs enable smooth, efficient integration of distributed systems. As REST has become a standard, the API ecosystem has grown much richer in recent years and is increasingly integrated into the DevOps ecosystem. It has been infused with agility, CI/CD, and FinOps, and continues to develop by itself. In this article, we're going to compile these new practices and tools to give you a broad overview of what an "API approach" can do. API Design and Documentation The API design and documentation stage is crucial, as it defines the basis for all subsequent development. For this reason, it is essential to use methodologies such as domain-driven design (DDD), event storming, and API goals canvas — which represents what, who, how, inputs, outputs, and goals — to understand business needs and identify relevant domains and the objectives of the APIs to be developed. These workshops enable businesses and dev teams to work together and define API objectives and interactions between different business domains. Figure 1. Designing APIs When designing and documenting APIs, it's essential to take into account the fundamental principles of REST APIs. This includes identifying resources and representing them using meaningful URLs, making appropriate use of HTTP methods for CRUD (Create, Read, Update, Delete) operations, and managing resource states in a stateless way. By adopting a resource-oriented approach, development teams can design REST APIs that are intuitive and easy to use for client developers. REST API documentation should highlight available endpoints, supported methods, accepted and returned data formats, and any security or pagination constraints. That said, the REST principles still leave the freedom to make a number of choices, such as the choice of naming conventions. For a compilation of these best practices, you can read my latest Refcard, API Integration Patterns.
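As a minimal sketch of this resource-oriented style, the following Spring-flavored controller maps the usual CRUD operations onto a hypothetical /api/orders resource; the resource, fields, and in-memory store are illustrative assumptions, not a prescription from the article.

Java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: an "orders" resource exposed through meaningful URLs and the standard HTTP methods.
@RestController
@RequestMapping("/api/orders")
public class OrderController {

    public record Order(Long id, String item, int quantity) { }

    private final Map<Long, Order> store = new ConcurrentHashMap<>();
    private final AtomicLong sequence = new AtomicLong();

    @GetMapping                                  // Read (collection): GET /api/orders
    public Collection<Order> list() {
        return store.values();
    }

    @GetMapping("/{id}")                         // Read (single resource): GET /api/orders/{id}
    public ResponseEntity<Order> get(@PathVariable long id) {
        Order order = store.get(id);
        return order == null ? ResponseEntity.notFound().build() : ResponseEntity.ok(order);
    }

    @PostMapping                                 // Create: POST /api/orders
    public Order create(@RequestBody Order order) {
        long id = sequence.incrementAndGet();
        Order saved = new Order(id, order.item(), order.quantity());
        store.put(id, saved);
        return saved;
    }

    @PutMapping("/{id}")                         // Update: PUT /api/orders/{id}
    public Order update(@PathVariable long id, @RequestBody Order order) {
        Order saved = new Order(id, order.item(), order.quantity());
        store.put(id, saved);
        return saved;
    }

    @DeleteMapping("/{id}")                      // Delete: DELETE /api/orders/{id}
    public ResponseEntity<Void> delete(@PathVariable long id) {
        store.remove(id);
        return ResponseEntity.noContent().build();
    }
}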
In this stage, API style books play a crucial role. They give design guidelines and standards to ensure the consistency and quality of the APIs developed. These style books define rules on aspects such as URI structure, HTTP methods to be used, data formats, error handling, and so on. They serve as a common reference for all teams working on API projects within the organization. Stoplight and SwaggerHub are commonly used, but a simple wiki tool could be enough. Data model libraries complete this phase by providing reusable data models, which define the standard data structures used in APIs. Data model libraries include JSON schemas, database definitions, object models, and more. They facilitate development by providing ready-to-use assets, reducing errors, and speeding up development. Commonly used tools include Apicurio and Stoplight. An API workflow description is often missing from the APIs we discover on API portals. Questions arise such as: How do I chain API calls? How do I describe the sequence of calls? With a drawing? With text in the API description? How do I make it readable and regularly updated by the person who knows the API best (i.e., the developer)? It could still be a pain to understand the sequence of API calls. However, this is often covered by the additional documentation that can be provided on an API portal. Yet at the same time, this remains decoupled from the code supplied by the developers. The OpenAPI Specification allows you to define links and callbacks, but it quickly reaches its limits for explaining things properly. This is why the OpenAPI Workflows Specification has recently appeared, allowing API workflows to be defined. In this specification, the steps are always described in JSON, which, in turn, allows a schema to be generated to explain the sequence of calls. If you want to describe your workflows from OpenAPI specifications, you can use Swagger Editor or SwaggerHub. And you can use Swagger to UML or openapi-to-plantuml. If you want to begin by designing sequence diagrams, you can use PlantUML or Lucidchart, for instance. There is no unique toolchain that fits all needs; you have to first know if you prefer a top-down or bottom-up approach. Tools such as Stoplight Studio combined with Redocly are commonly known for handling these topics — Apicurio as well. They can be used for API design, enabling teams to easily create and visualize OpenAPI specifications using a user-friendly interface. These specifications can then be used to automatically generate interactive documentation, ensuring that documentation is always up to date and consistent with API specifications. API Development Once the API specifications have been defined, the next step is to develop the APIs following the guidelines and models established during the design phase. Agile software development methods, effective collaboration, and version management are must-have practices to ensure good, fast development. Figure 2. Building APIs For versioning, teams use version control systems such as Git or GitHub to manage API source code. Version control enables seamless collaboration between developers and ensures full traceability of API changes over time. During development, the quality of the API specification can be checked using linting tools. These tools can check the syntax and structure of the code, compliance with coding standards and naming conventions, correct use of libraries and frameworks, the presence of dead or redundant code, and potential security problems. Swagger-Lint and Apicurio Studio or Stoplight can be used to carry out these and other linting checks, and these checks can be built into a CI/CD toolchain (more info to come in the API Lifecycle Management section). Automation plays a crucial role in this stage, enabling unit, security, and load tests to run seamlessly throughout the development process. Postman and Newman are often used to automate API testing to ensure quality and security requirements are met, but other solutions exist, like REST Assured, Karate Labs, and K6. Development frameworks supporting REST API development are very common, and the most popular ones include Express.js with Node.js, Spring Boot, and Meteor. Most of the popular frameworks support HTTP, so choosing one should not be complicated. API capabilities are a must when you choose a framework, but they are not the only criterion. Developers will build your stack, so you'll need frameworks that are both appreciated by your devs and relevant to other technical challenges you'll have to tackle. And we have to speak about mock prototyping! Proposing a mock API is something that can unblock inter-dependencies between developers whenever you target internal or external developers. This is generally based on the OpenAPI description of your API and is often taken into account by API management portals. There are also dedicated OSS projects such as MockServer or WireMock.
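To make that concrete, here is a minimal, illustrative WireMock sketch that stands up a mock of a not-yet-implemented endpoint so that client developers can work against it; the port, path, and payload are assumptions for the example, not part of any specific project.

Java
import com.github.tomakehurst.wiremock.WireMockServer;

import static com.github.tomakehurst.wiremock.client.WireMock.*;

public class OrderApiMock {

    public static void main(String[] args) {
        // Start an in-process mock HTTP server on a hypothetical port
        WireMockServer server = new WireMockServer(8089);
        server.start();

        // Stub a GET on a not-yet-implemented endpoint with a canned JSON response
        server.stubFor(get(urlEqualTo("/api/orders/1"))
                .willReturn(aResponse()
                        .withStatus(200)
                        .withHeader("Content-Type", "application/json")
                        .withBody("{\"id\":1,\"item\":\"filament\",\"quantity\":2}")));

        // Client teams can now develop and test against http://localhost:8089/api/orders/1
        // while the real API is still being built; call server.stop() when finished.
    }
}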
API Security API security is a major concern in API development and management. It is essential to implement authentication, authorization, and data encryption mechanisms to protect APIs against attacks and privacy violations. API keys, OAuth 2.0, and OpenID Connect are the three protocols to know: API keys are still widely used for API access due to their ease and low overhead. They are a unique set of characters sent as a pair (a user and a secret) and should be stored securely like passwords. OAuth 2.0 is a token-based authentication method involving three actors: the user, the integrating application (typically your API gateway), and the target application. The user grants the application access to the service provider through an exchange of tokens via the OAuth endpoint. OAuth is preferred for its granular access control and time-based limits. OpenID Connect is a standardization of OAuth 2.0 that adds normalized third-party identification and user identity. It is recommended for fine-grained authorization controls and managing multiple identity providers, though not all API providers require it. In addition to that, solutions such as Keycloak can be deployed to provide centralized management of identity and API access. Alternatives to Keycloak include OAuth2 Proxy, Gluu Server, WSO2 Identity Server, and Apache Syncope. But just talking about tools and protocols would not be enough to cover API security. Contrary to what we sometimes read, a front-end web application firewall (WAF) implementing the OWASP rules will prevent many problems. And, although it certainly deserves a dedicated DZone Refcard of its own (see Getting Started With DevSecOps), a comprehensive DevSecOps approach will greatly reduce the risks. However, automated security testing is also essential to guarantee API robustness against attacks. OSS tools such as ZAP can be used to tackle automated security tests, identifying potential vulnerabilities in APIs and enabling them to be corrected before they can be exploited by attackers.
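As a small illustration of the token-based flows mentioned above, the sketch below obtains an access token using the OAuth 2.0 client credentials grant with the JDK's HTTP client; the token endpoint (a Keycloak-style URL), client ID, and secret are placeholder assumptions.

Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TokenClient {

    public static void main(String[] args) throws Exception {
        // Placeholder Keycloak-style token endpoint, client ID, and client secret
        String tokenEndpoint = "https://auth.example.com/realms/demo/protocol/openid-connect/token";
        String form = "grant_type=client_credentials"
                + "&client_id=api-gateway"
                + "&client_secret=change-me";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(tokenEndpoint))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON response contains an access_token to send on API calls as
        // "Authorization: Bearer <token>"; parse it with your JSON library of choice.
        System.out.println(response.body());
    }
}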
API Lifecycle Management Once APIs have been developed, they need to be deployed and managed efficiently throughout their lifecycle. This involves version management, deployment management, performance monitoring, and ensuring API availability and reliability. API management platforms such as Gravitee, Tyk, WSO2 API Manager, Google Cloud Apigee, and Amazon API Gateway (among others) are used for API deployment, version management, and monitoring. These platforms offer advanced features such as caching, rate limiting, API security, and quota management. Clearly, these are must-haves if you want to be able to scale. Figure 3. Running APIs To ensure compliance with standards and guidelines established during the design phase, tools such as Stoplight's Spectral are used to perform a linting analysis of OpenAPI specifications, identifying potential problems and ensuring API consistency with design standards. And of course, at the end of the chain, you need to document your API. Tools exist to automate many tasks, such as Redocly, which generates interactive documentation from the OpenAPI Specification. The added benefit is that you ensure that your documentation is constantly up to date and always simple and readable for everyone, developers and business analysts alike. API management also involves continuous monitoring of API performance, availability, and security, as well as the timely implementation of patches and updates to ensure their smooth operation. API Analysis and Monitoring Analysis and monitoring of APIs are essential to ensure API performance, reliability, and availability. It is important to monitor API performance in real time, collect data on API usage, and detect potential problems early. The ELK Stack (Elasticsearch, Logstash, Kibana) is often used to collect, store, and analyze API access logs for monitoring performance and detecting errors. OpenTelemetry is also used in many use cases and is a must-have if you want to monitor end-to-end processes, especially ones that include an API. Regarding API performance metrics, Prometheus and Grafana are commonly used for real-time monitoring, providing a wealth of information on usage trends, bottlenecks, and performance problems. FinOps and Run Management Finally, once APIs are deployed and running, it's important to optimize operating costs and monitor cloud infrastructure expenses. FinOps aims to optimize infrastructure costs by adopting practices such as resource optimization, cost forecasting, and budget management. Cloud cost monitoring tools such as AWS Cost Explorer, Google Cloud Billing, and Azure Cost Management are used to track and manage cloud infrastructure spend, keeping operating costs under control and optimizing API spend. However, in a hybrid cloud world, we could consider open-source solutions like Cloud Custodian, OpenCost, and CloudCheckr. Conclusion Obviously, you don't need to put all this in place right away to start your API journey. You have to first think about how you want to work and what your priorities are. Maybe you should prioritize design tools, like linting tools, or define your API style book and API design tool. Of course, prioritize tools that are commonly used — there's no wheel to reinvent! In fact, I'd say implement everything that is at the beginning of this toolchain because it will be easier to make changes afterward. I hope that keeping all these points in mind will enable you to get started serenely while prioritizing your own API needs. This is an excerpt from DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices. Read the Free Report
Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices. APIs play a pivotal role in the world of modern software development. Multiple types of APIs can be used to establish communication and data exchange between various systems. At the forefront lies the REST approach, which has dominated the industry due to its simplicity and scalability. However, as technology has evolved, the demands of developers and businesses have also changed. In recent years, alternatives such as GraphQL and asynchronous event-driven APIs have also emerged. They offer distinct advantages over traditional REST APIs. In this article, we will look into each of these API technologies and build a comparative understanding of them. REST: The Start of Resource-Oriented Communication REST architecture revolves around the concept of resources. These are entities that can be managed through standard HTTP methods such as GET, POST, PUT, and DELETE. One of the key characteristics of REST is its stateless nature, where each request from a client contains all the necessary information for the server to fulfill it. This decouples the client and server, allowing them to be scaled independently. Advantages and Disadvantages of REST REST APIs have some significant advantages: REST follows a simple and intuitive design based on standard HTTP methods. Each request in the REST approach is independent, resulting in better scalability and reliability. REST utilizes HTTP caching mechanisms to enhance performance and reduce the load on the origin server. REST is interoperable, working well with various programming languages and platforms due to its standard format. However, REST architecture also has several disadvantages: REST APIs can result in overfetching, where clients receive more data than needed, leading to inefficiency and waste of network bandwidth. Similar to the first point, REST APIs can also suffer from underfetching, where multiple requests are needed to fulfill complex data requirements. This results in increased latency. REST follows a synchronous approach that can lead to blocking and performance issues in high-load scenarios. Changes to the API's data schema can impact clients, resulting in tight coupling. Use Cases of REST APIs There are ideal use cases where REST APIs are much better suited when compared to other types of APIs, for example: Caching intensive applications – A read-heavy application, such as news websites or static content, can benefit from REST's caching mechanism. The standardized caching directives of REST make it easier to implement. Simple CRUD operations – When dealing with straightforward CRUD operations, REST APIs offer simplicity and predictability. Applications with a clear and static data model often find REST APIs to be more suitable. GraphQL: The Rise of Declarative Data Fetching With APIs GraphQL is a combination of an open-source language for querying data as well as a runtime for fulfilling those queries. The key principle behind GraphQL is to have a hierarchical structure for defining data queries, letting the clients precisely specify the data they need in a single request. Figure 1. GraphQL in the big picture In quite a few ways, GraphQL was a direct response to the issues with the traditional REST API architecture. However, it also promotes a strongly typed schema, offering developers a clear idea of what to expect. 
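To make the contrast with REST concrete, here is a minimal, illustrative sketch of a client asking a GraphQL endpoint for exactly the fields it needs in a single request; the endpoint URL and the schema (an order with a customer name and item details) are assumptions for the example, not part of any specific API.

Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GraphQLClientSketch {

    public static void main(String[] args) throws Exception {
        // Hypothetical schema: the client names only the fields it needs,
        // so there is no overfetching and no second round trip for the nested items.
        String query = "{ \"query\": \"query { order(id: 1) { customer { name } items { title price } } }\" }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/graphql")) // a single endpoint serves all queries
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The response JSON mirrors the shape of the query, nothing more and nothing less.
        System.out.println(response.body());
    }
}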
GraphQL supports real-time data updates through subscriptions. Over the years, a lot of work has happened on tools like GraphQL Federation to make GraphQL APIs more scalable for large enterprises with multiple domain areas. Advantages and Disadvantages of GraphQL GraphQL provides some key advantages: With GraphQL, clients can request only the specific data they need. This eliminates the overfetching and underfetching issues with REST APIs. GraphQL's strongly typed schema approach provides a clear structure and validation, speeding up development and documentation. GraphQL typically operates through a single endpoint. Clients just need to care about a single endpoint while talking to a GraphQL server even though there might be multiple sources for the data. Built-in introspection allows clients to explore the schema and discover available data and operations. There are also several disadvantages to GraphQL: Implementing GraphQL requires additional effort and expertise when compared to traditional REST APIs. Since the queries in GraphQL are flexible, caching of data can be challenging and may need custom solutions. While GraphQL reduces overfetching at the top level, nested queries can still lead to unnecessary data retrievals. Ownership of the common GraphQL layer becomes confusing, unlike the clear boundaries of a REST API. Use Cases of GraphQL There are specific scenarios where GraphQL does a better job as compared to REST APIs, for instance: Complex and nested data requirements – To fetch data with complex relationships, GraphQL helps clients precisely specify the data they need in a single query. Real-time data updates – GraphQL subscriptions help applications handle real-time data updates such as chat applications or live dashboards. With GraphQL, clients can subscribe to changes in specific data, allowing real-time updates without the need for frequent polling. Microservices architectures – In this case, data is distributed across multiple services. GraphQL provides a unified interface for clients to query data from various services. The client application doesn't have to manage multiple REST endpoints. Asynchronous APIs: A Shift to Event-Driven Architecture Over the years, the push to adopt, or migrate to, a cloud-native architecture has also given rise to event-driven architectures, the advantage being the prospect of non-blocking communication between components. With asynchronous APIs, clients don't need to wait for a response before proceeding further. They can send requests and continue their execution process. Such an approach is advantageous for scenarios that require high concurrency, scalability, and responsiveness. In event-driven systems, asynchronous APIs handle events and messages along with help from technologies like Apache Kafka and RabbitMQ, which offer a medium of communication between the message producer and the consumer. Considering a typical system using an event-driven API approach, we have producers publish events to topics, and consumers subscribe to these topics to receive and process the events asynchronously. This allows for seamless scalability and fault tolerance because both producers and consumers can evolve independently. The below diagram shows such a system: Figure 2. An event-driven system with Kafka and asynchronous APIs
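As a rough, minimal sketch of that pattern (the topic name, broker address, and payload are assumptions), the snippet below shows a producer publishing an event to a Kafka topic and a consumer processing it asynchronously with the standard Kafka Java client; the two sides never call each other directly.

Java
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OrderEventsSketch {

    private static final String TOPIC = "order-events"; // hypothetical topic name

    public static void main(String[] args) {
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Producer: publish the event and carry on without waiting for any consumer
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>(TOPIC, "order-1", "{\"status\":\"CREATED\"}"));
        }

        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "order-processors");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        // Consumer: subscribe to the topic and process events as they arrive, independently of the producer
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of(TOPIC));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            records.forEach(r -> System.out.println("Processing event " + r.key() + ": " + r.value()));
        }
    }
}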
Advantages and Disadvantages of Asynchronous APIs There are some key advantages of asynchronous APIs: Asynchronous APIs are well suited for handling high concurrency and scalability requirements since multiple requests can be handled concurrently. Asynchronous APIs also enable real-time data processing by allowing timely responses to events. Asynchronous APIs can also help better utilize system resources by offloading tasks to background processes. Lastly, asynchronous APIs increase the general fault tolerance of a system as one component failing doesn't disrupt the entire system. However, just like other API types, asynchronous APIs also have several disadvantages: There is increased complexity around message delivery, ordering, and error handling. Asynchronous APIs are more challenging to debug and test. Systems built using asynchronous APIs often result in eventual consistency, where data updates aren't immediately reflected across all components. Asynchronous APIs can also increase costs because of the specialized systems needed for handling messages. Use Cases of Asynchronous APIs There are a few ideal use cases for asynchronous APIs when compared to REST and GraphQL APIs, including: Real-time data streaming – Asynchronous APIs are the best choice for real-time data streaming needs such as social media feeds, financial market updates, and IoT sensor data. These applications generate large volumes of data that need to be processed and delivered to clients in near real time. Integration with third-party systems – Asynchronous APIs are quite suitable for integrating with third-party systems that may have unpredictable response times or availability SLAs. Background tasks – Lastly, applications that require execution of background tasks — such as sending emails, notifications, or image/video processing — can benefit from the use of asynchronous APIs. Side-by-Side Comparison of REST, GraphQL, and Asynchronous APIs We've looked at all three types of API architectures. It is time to compare them side by side so that we can make better decisions about choosing one over the other. The table below shows this comparison across multiple parameters: Table 1.
Comparing REST, GraphQL, and Async APIs
Parameter | REST APIs | GraphQL APIs | Asynchronous APIs
Data fetching approach | Data is fetched with predefined endpoints | Clients specify the exact data requirements in the query | Data is passed in the form of asynchronous messages
Performance and scalability | Highly suitable for scalable applications; can suffer from overfetching and underfetching problems | Scalable; nested queries can be problematic | Highly scalable; efficient for real-time data processing
Flexibility and ease of use | Limited flexibility in querying data | High flexibility for querying data | Limited flexibility in querying data and requires understanding of an event-driven approach
Developer experience and learning curve | Well established and familiar to many developers | Moderate learning curve in terms of understanding the GraphQL syntax | Steeper learning curve
Real-time capabilities | Limited real-time capabilities, relying on techniques like polling and webhooks for updates | Real-time capabilities through subscriptions | Designed for real-time data processing; highly suitable for streaming applications
Tooling and ecosystem support | Abundant tooling and ecosystem support | Growing ecosystem | The need for specialized tools such as messaging platforms like RabbitMQ or Kafka
Conclusion In this article, we've explored the key distinctions between different API architectures: REST, GraphQL, and asynchronous APIs. We've also looked at scenarios where a particular type of API may be more suitable than others. Looking ahead, the API development landscape is poised for further transformation. Emerging technologies such as machine learning, edge computing, and IoT will drive new demands that necessitate the evolution of API approaches. Also, with the rapid growth of distributed systems, APIs will play a key role in enabling communication. As a developer, it's extremely important to understand the strengths and limitations of each API style and to select the approach that's most suitable for a given requirement. This mentality can help developers navigate the API landscape with confidence. This is an excerpt from DZone's 2024 Trend Report, Modern API Management: Connecting Data-Driven Architectures Alongside AI, Automation, and Microservices. Read the Free Report