Posts Tagged ‘technology’

I was recently listening to a “Nanoservices? Miniservices? Macroservices?” podcast from the NoFluffJustStuff team on different service styles, and I learned the terminology for a style of structuring services that I had recently been advocating. Miniservices describe a way of breaking a system into services that is more balanced than microservices.

Microservices Recap

It is worth reviewing some key characteristics of a microservice.

  • A microservice is built around a bounded context. A bounded context is described in the language of the business and constitutes the boundary or ‘job’ of a service. Typical examples of bounded contexts might be “Billing”, “Maintaining Customer Information”, or “The Orders in the system”; in practice, however, the context boundaries tend to be much smaller and more refined in order to keep the services small. A bounded context is specifically not a technical boundary such as “Frontend” versus “Backend”. A microservice should be concerned with a single bounded context and have a single responsibility.
  • A microservice has its own data-store (i.e. database). The data-store belongs to the microservice and is not exposed outside of it. Multiple microservices do not share data-stores.
  • Microservices do not share code with other microservices. Two microservices can use the same library, but this library should be an independent project with its own release cycle and a commitment to stable public APIs. If two microservices need the same business rules then those business rules should belong to one bounded context (possibly a new microservice) and the other services should invoke those business rules through a stable service interface.
  • Each microservice can evolve and be deployed independently of other microservices. This allows each microservice team to move fast and make changes without having to coordinate with other teams. Changes to the public API are typically handled by versioning the API and supporting the old versions through a deprecation period.

The ability to deploy and evolve a microservice independently is one of the key goals of a microservice architecture. The desire for an organizational structure consisting of many small teams that can have short release cycles and deploy independently of other teams is what leads to many of the other characteristics on the list. Sharing code and databases across services and teams means that those services and teams now need to coordinate any changes to those shared pieces.

Microservice Drawbacks

There are some costs and downsides to this approach to software development resulting from the trade-offs made to optimize for small independent services.

  • Organizations that would previously have had one or two larger “applications” to handle a collection of business functionality might now have dozens of microservices to do the same thing. Each of these microservices needs to be maintained and kept up to date for security issues and deprecated APIs.
  • Code duplication becomes much more common because code sharing isn’t allowed. If some code is useful to two different services and isn’t appropriate to break out into its own service (or the schedule doesn’t allow for it), then teams will typically just copy the code into the other service’s source tree. Twenty years ago this was considered a big anti-pattern. Code duplication adds long-term maintenance costs.
  • The latency of an individual user-facing business operation can start to grow when completing the workflow requires contacting a dozen other services. There are techniques to mitigate this, but they add complexity.
  • Coordinating integration tests of the entire architecture becomes very difficult because there are so many services and teams involved. Some organizations end up doing a lot of testing in production (i.e. A/B testing or running a parallel instance with mirrored production traffic).

Any individual microservice might be simple and not complex, but an architecture with dozens to hundreds of microservices tends to be very complex overall. The trade-offs made to allow small independent teams to deploy quickly aren’t the best fit for all organizations.

Introducing Miniservices

Organizations that aren’t as focused on small independent teams with fast deployment cycles might instead be focused on efficiently evolving a software system as the business changes.

  • Business functionality is grouped into related groupings, but the strictness of the “single responsibility service” is no longer a driving principle. I feel that “Bounded Context” would probably be a good term for a “collection of related business functions”, but that means something different from how the term is typically used in a pure microservice architecture. Multiple services will live inside a group of related business functionality. I will call this a “Business Context”, though it is possible others have come up with a better term. A “Business Context” is usually larger than the “Bounded Context” that defines a single microservice.
  • The multiple related miniservices in a “business context” are allowed to share the same data-store (i.e. database). Not allowing two microservices to share a single database is one of the most expensive aspects of microservices: many common business queries and reports become difficult to produce because they have to combine data from multiple services, often at the cost of increased complexity. Allowing multiple related miniservices to share a database can significantly reduce the implementation cost of many architectures. There will often be multiple databases within a “business context”, and some databases will be more closely related to (or mostly used by) certain services, but there is no hard rule preventing multiple miniservices from using the same database. These decisions should be made on an individual basis while keeping efficiency in mind.
  • The data-store shared between multiple miniservices in the “business context” should have a well-defined schema. This schema can change, but it needs to be well defined at any point in time and the data-store should enforce it. NoSQL systems that don’t enforce a formal schema can work when a single service is the only user of the data, but any data that will be used by multiple miniservices needs an enforced schema.
  • The multiple related miniservices are allowed to share code, typically through libraries (i.e. JARs). Developers should easily be able to move code into a library so that it can be shared with the related miniservices.
  • The teams responsible for the group of related miniservices should be close to each other on the organizational chart. Ideally the same team should maintain the whole collection of related miniservices. This allows for easier coordination of changes to shared database schemas and shared libraries. Different people or subteams can be dedicated to some of the miniservices, but it is important that they coordinate and work together when required.
  • Each miniservice within a “business context” can still be modified and deployed on independent schedules.  The miniservices are still standalone deployable services.  Sometimes the changes and deployment of two related miniservices will need to be coordinated and possibly combined.  This is both common and acceptable in a miniservice architecture.  We rely on the team to figure out these dependencies and coordinate them.
  • All of the related miniservices in a “Business Context” must be tested together as part of a pre-deployment integration test. This is needed because a change to one service might impact another service without anyone realizing it.
  • Multiple related miniservices can be deployed on a single virtual machine, or you can adopt an approach with Docker that gives each miniservice its own container. A miniservice architecture is flexible enough to accommodate both.
  • An individual “release” will consist of one or more miniservices from the set of related services being deployed together.

Miniservices and distantly related services

The above section describes how you can break a group of related business functionality into miniservices and have them share a database. Some business functionality and its related services will end up being part of a different “business context”. These services (and their associated business rules) will usually be maintained by a different team, often living on a different part of the organizational chart. Sharing data-stores and code between business contexts is discouraged. Two miniservices from different “business contexts” should probably not be writing data into the same data-store. Allowing a miniservice from one context to pull (or query) data from the data-store belonging to another context might sometimes be done for practical or efficiency reasons; this is discouraged but not outright forbidden.


The @Bean annotation is a great way to declare Spring beans in code. It lets the programmer control which beans get instantiated and with what arguments. Some situations call for deciding at runtime, based on configuration, whether beans should be instantiated; this is often done with the @Conditional annotation on @Configuration classes. The @Conditional annotation works well if you are enabling or disabling a particular bean through configuration, but sometimes the number of beans needs to be completely dynamic.
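
A minimal sketch of the enable/disable case might look like the following (the ReportingService bean and the reporting.enabled property are hypothetical): a custom Condition checks a configuration property and a @Configuration class guards the bean with @Conditional.

public class ReportingEnabledCondition implements Condition {
    @Override
    public boolean matches(ConditionContext context, AnnotatedTypeMetadata metadata) {
        // match only when the (hypothetical) reporting.enabled property is true
        return "true".equalsIgnoreCase(
                context.getEnvironment().getProperty("reporting.enabled"));
    }
}

@Configuration
public class ReportingConfig {
    @Bean
    @Conditional(ReportingEnabledCondition.class)
    public ReportingService reportingService() {
        // this bean only exists when ReportingEnabledCondition matches
        return new ReportingService();
    }
}

Spring Boot’s @ConditionalOnProperty covers this particular on/off case without writing a custom Condition class.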

The fully dynamic case often comes up when your application needs to connect to a number of external services or databases and you want that number to be driven by configuration.
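
One possible approach, sketched here under the assumption that a comma-separated app.datasources property lists the connection names (this is illustrative and not necessarily the approach described in the rest of this post), is a BeanDefinitionRegistryPostProcessor that registers one bean definition per configured entry:

public class DynamicDataSourceRegistrar
        implements BeanDefinitionRegistryPostProcessor, EnvironmentAware {

    private Environment environment;

    @Override
    public void setEnvironment(Environment environment) {
        this.environment = environment;
    }

    @Override
    public void postProcessBeanDefinitionRegistry(BeanDefinitionRegistry registry) {
        // e.g. app.datasources=billing,orders,reporting (hypothetical property)
        String configured = environment.getProperty("app.datasources", "");
        for (String rawName : configured.split(",")) {
            String name = rawName.trim();
            if (name.isEmpty()) {
                continue;
            }
            BeanDefinition definition = BeanDefinitionBuilder
                    .genericBeanDefinition(DriverManagerDataSource.class)
                    .addPropertyValue("url",
                            environment.getProperty("app.datasource." + name + ".url"))
                    .getBeanDefinition();
            // one DataSource bean per configured name, e.g. "billingDataSource"
            registry.registerBeanDefinition(name + "DataSource", definition);
        }
    }

    @Override
    public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory) {
        // nothing to do here; all the work happens when the registry is post-processed
    }
}

The registrar would typically be declared with a static @Bean method so it is created before the regular beans it registers.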

(more…)

Spring Integration makes it easy to monitor an SFTP server for new files and inject those files into your application for processing. I like to configure my Spring Integration applications with annotations and @Bean configurations. I found the documentation and online examples for doing this with annotations a bit lacking.

 

This post demonstrates the basics of how to get a Spring Boot (1.3.x or 1.4.x) application to monitor an SFTP server.
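
As a rough sketch of the kind of @Bean configuration involved (imports omitted like the other listings here; the host, credentials, directories, and channel name are placeholders), assuming spring-integration-sftp is on the classpath:

@Configuration
@EnableIntegration
public class SftpPollingConfig {

    @Bean
    public DefaultSftpSessionFactory sftpSessionFactory() {
        DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
        factory.setHost("sftp.example.com");   // placeholder connection details
        factory.setPort(22);
        factory.setUser("appuser");
        factory.setPassword("secret");
        factory.setAllowUnknownKeys(true);     // acceptable for a demo, not for production
        return factory;
    }

    @Bean
    public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
        SftpInboundFileSynchronizer synchronizer =
                new SftpInboundFileSynchronizer(sftpSessionFactory());
        synchronizer.setRemoteDirectory("/incoming");
        synchronizer.setFilter(new SftpSimplePatternFileListFilter("*.csv"));
        return synchronizer;
    }

    // polls the remote directory and emits one message per downloaded file
    @Bean
    @InboundChannelAdapter(value = "sftpFiles", poller = @Poller(fixedDelay = "30000"))
    public MessageSource<File> sftpMessageSource() {
        SftpInboundFileSynchronizingMessageSource source =
                new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
        source.setLocalDirectory(new File("/tmp/sftp-local"));
        source.setAutoCreateLocalDirectory(true);
        return source;
    }

    // downstream processing of each downloaded file
    @Bean
    @ServiceActivator(inputChannel = "sftpFiles")
    public MessageHandler fileHandler() {
        return message -> System.out.println("Received file: " + message.getPayload());
    }
}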

(more…)

Sometimes the behaviour of an application is controlled through properties, and the application needs to detect changes to the property file so it can switch to the new configuration. You also want to ensure that a particular request uses either the old configuration or the new configuration, but not a mixture of the two. Think of this as ACID-like isolation for properties, ensuring that your requests don’t get processed with an inconsistent configuration.

We accomplish this by using a property lookup bean (PropertyBean) that returns property values. The PropertyBean is a request-scoped bean, meaning that a new instance is created for each request in a Spring MVC application. The PropertyBean has an init method that runs post-construction and calls the PropertyCache bean to get the current set of properties. Each instance of the PropertyBean will return the property values from the Properties object it obtained at initialization.


The PropertyBean has a reference to a singleton-scoped PropertyCache. The PropertyCache maintains a reference to the current Properties object. When the PropertyCache bean is asked to return the current properties it checks whether they have changed on disk and, if so, reloads them.

class PropertyBean {
    private Properties currentProperties;

    @Autowired
    PropertyCache propertyCache;

    @PostConstruct
    void doInit() {
        currentProperties = propertyCache.getProperties();
    }
    public String getProperty(String propertyName) {
        return currentProperties.getProperty(propertyName);
    }
}

 

The PropertyBean would look similar to the above sample class. This class simply caches an instance of a Properties object post-construction and delegates property requests to this instance.

The PropertyBean needs to be created as a request scoped bean. This can be done with a Spring annotation configuration class as follows.

@Configuration
class MyConfig {
    @Bean
    @Scope(value="request", proxyMode=ScopedProxyMode.TARGET_CLASS)
    public PropertyBean propertyBean() {
        return new PropertyBean();
    }
}

 

Managing the loading of properties falls to the PropertyCache class, which would look similar to the class below.

@Component
public class PropertyCache {
    private Properties currentProperties;
    private long lastReloadTime = 0;

    @Value("${dynamic.property.file.path}")
    private String propertyFileName;

    public Properties getProperties() {
        reloadPropertiesIfRequired();
        synchronized (this) {
            return currentProperties;
        }
    }

    private void reloadPropertiesIfRequired() {
        File f = new File(propertyFileName);
        // only reload when the file has been modified since the last reload
        if (f.lastModified() > lastReloadTime) {
            Properties newProperties = new Properties();
            // try-with-resources closes the stream even if load() fails
            try (FileInputStream stream = new FileInputStream(f)) {
                newProperties.load(stream);
                synchronized (this) {
                    lastReloadTime = System.currentTimeMillis();
                    currentProperties = newProperties;
                }
            } catch (IOException exp) {
                //do something intelligent on the error
            }
        }
    }
}

 

All of the business logic code that needs properties to process a request will use the request-scoped PropertyBean instance. This ensures that the entire request uses the same set of properties even if the underlying property file changes.

class SomeBusinessService {
    @Autowired
    private PropertyBean propertyBean;

    public void someOperation() {
        String v = propertyBean.getProperty("some.property");
        .
        .
    }
}

When a new request is started, a new instance of the PropertyBean is created. The doInit() method invokes the PropertyCache getProperties() method. The PropertyCache checks the timestamp of the file for modifications; if it detects one, it reloads the property file. Any PropertyBean instances that have already been initialized will continue to use the old Properties instance.

Tonight I presented a talk on using JSON in Postgres at the Toronto Postgres users group. Pivotal hosted the talk at their lovely downtown Toronto office. Turnout was good with a little over 15 people attending (not including the construction workers banging against some nearby windows).

I talked about the JSON and JSONB datatypes in Postgres and some ideas for appropriate uses of NoSQL features in a SQL database like Postgres.

My slides are available for download

We are thinking of having lightning and ignite talks for the next meetup. If anyone is in the Toronto area and wants to give a short (5 minute) talk on a Postgres-related topic let me know.

I was recently working on a project where we had about half a dozen developers working on an established code base. All of the developers were new to the code base and I knew that we were going to be making a fair number of database schema and data-seeding changes in a short period of time. Each developer had their own development environment with a dedicated database (PostgreSQL). The developers on the project had their hands full learning about the code base and I didn’t want to distract them by having to take a lot of their time managing their development database instances.

I decided to try using Alembic to manage the database schema migrations.
(more…)

Last week I was in Chicago giving a talk at PostgresOpen on managing PostgreSQL with puppet. The talk was well attended and appears to have been well received.

Puppet is configuration management software that allows you to describe how your servers should look using a declarative syntax. You describe what packages you want to install (obviously postgres) and how your configuration files should look. Puppet also allows you to run commands to create databases or database objects such as users.

In my talk I discuss why it is important to use a repeatable procedure for building production database servers and how this is a tool in bridging the divide between developers and operations staff.

I talk about how deploying servers with automation allows your servers to be similar. Similar might not mean identical, but the differences between your database servers are controlled and managed. This also applies to your development and QA servers. If you deploy your staging, QA, and development servers using the same puppet manifests as your production servers, but possibly with different configuration options, then you will be more confident in your testing.

You can view my slides. They recorded the talk and I will update this post with a link to the talk when it is posted.

Updated: You can view a recording of the talk below.

Last week I was hanging out on the top of San Francisco for puppetconf 2013. PuppetConf is the annual conference dedicated to the puppet configuration management system. Some of the lessons I learned are worth sharing.

1. Walking to the top of Nob Hill is less fun than it looks

After landing at SFO I went from the airport to the BART station. Inside the BART station, at the vending machine, I encountered a pair having difficulty figuring out how to buy a ticket. Having been in San Francisco a few months ago for SOTM-US, I considered myself competent at operating the BART vending machine, so I helped them put money on a ticket. I then rode the train to Powell station. My hotel was at the top of Nob Hill and the cable-car line seemed long, so I decided to walk. Walking up a big steep hill with a laptop and luggage on a warmish summer day is a lot of work.

2. Puppet(conf) is more popular than PostgreSQL(conf).

I have been to a lot of PostgreSQL (and other open-source) conferences over the past 7 years, and puppetconf, with close to 1200 attendees, was larger than FOSS4G 2010 and probably 4 times larger than most of the other conferences I’ve recently attended. Puppet and DevOps are hot stuff these days and all the cool kids appear to be doing it. I spoke to numerous people from old-school industries like retail and utilities, along with people from technology industry giants like salesforce.com and three-letter I.T. vendors. They are all starting to use puppet to help with managing their I.T. configuration. The main competitor to puppet seems to be Chef, another open-source configuration management solution, but I didn’t talk to a single person at the conference who could point to a closed-source alternative with comparable features that they were using instead of puppet. Puppet is replacing hundreds of poorly written and misunderstood shell scripts.

3. Culture Is A Big Deal
An entire conference track, along with a keynote, was dedicated to DevOps culture. No one is really sure what DevOps means, but it appears to be almost as vague as cloud and getting almost as much hype. The reason DevOps and culture are so closely linked to puppet and configuration management is that puppet is a tool to help bridge the cultural divides in your organization. Communicating between people and teams in spoken and written languages such as English is imprecise, but communicating how things should be configured through a declarative and executable syntax like puppet manifests is very precise. A puppet manifest puts the developers and operations staff on the same page and can help ensure what QA is testing is actually what will be deployed. Automated configuration management is also really helpful if you want to do frequent deployments into test and production environments.

4. Orchestration is next
Puppet deals with configuring a single server and makes it easy for many servers to be configured in the same way. Often you need to configure multiple servers at the same time in a related way. For example, if you are deploying a new version of your software you might need to make some database changes, remove a web-server from the load-balancer, apply the software update to the web-server, then add it back into the load-balancer pool and repeat for all of your web-servers. Once all of your web-servers have been updated you might need to make another set of database schema changes. Conducting the changes on each of the servers in the right order is called orchestration. Puppetlabs has software called mcollective (and PuppetDB) to help with this, but it mostly deals with the RPC aspects of the problem and doesn’t (at least not yet) provide facilities for dealing with errors on one of the servers and rolling back, or tools to help with managing dependencies between servers. VMWare claims to have products (vDirector and vSphere Orchestrator) that might do better at this, but I didn’t make it to a presentation on them. I expect to see more tooling in this area in the next year.

I will be giving a talk at the PGOpen Conference in Chicago later this month on how Puppet is used to manage PostgreSQL databases that make up key parts of the domain name registry system. Come to PGOpen so you can learn more.

This weekend we had the second annual Toronto OpenStreetMap developer weekend. The nice folks at the Ryerson Department of Geography hosted us. My focus this weekend was to work with Serge and Martijn on maproulette.

Maproulette is software that presents users with an easy-to-do mapping task, which they can complete and then mark as done. Examples of past maproulette mapping challenges include fixing connectivity errors or fixing objects touched by the license change.
(more…)

I spent the weekend attending Pycon Canada where I gave a talk on Pl/Python. I want to thank the conference organizers for putting on an excellent conference. I am told that this was the first time Pycon had a regional conference in Canada and that it was put together by a group of volunteers in less than 6 months.

One of my favourite parts of local/regional conferences held on weekends is that they tend to attract attendees who are passionate about computers and technology. The people I spoke with at the conference were there because they wanted to be there, not because their boss wanted them to be there, and they either loved Python or wanted to learn more about it. I’ve attended many great PostgreSQL conferences over the past few years, but it was nice to spend some time talking with people from broader development backgrounds.

In my discussions with people at the conference I noticed a trend. People I spoke with who worked at companies doing Python development tended to be using PostgreSQL. The ones that weren’t currently using PostgreSQL were using MySQL and either talking about moving to PostgreSQL or apologetic for still being on MySQL. The MySQL users were often apologizing before I told them that I was a PostgreSQL contributor. Some of the MySQL users also mentioned that they were using non-Oracle forks like Percona.

This was in contrast to the people at the Python conference who described their workplaces as doing primarily Java development. The Java development shops tended to be using Oracle or SQL Server. I admit that the sample size of Java developers wasn’t that big (this was a Python conference after all), but my observations are worth keeping in mind since they might be indicating a pattern. Other people have commented on the popularity of PostgreSQL in the Ruby community.

I wonder how much of this observation is because older applications written in Java are already using SQL Server or Oracle and there hasn’t been a strong enough driver to change to PostgreSQL, while newer software projects tend to choose Python or Ruby over Java and at the same time pick a FLOSS database such as PostgreSQL, where they don’t have to worry about migrating a legacy application.

My talk on writing stored functions in Pl/Python was well received. A lot of people saw the appeal of being able to write their stored functions in Python instead of PL/pgSQL, but that shouldn’t be a surprise considering this was a Python conference.

My slides are available here, and the video of the talk is posted at pyvideo.