Posts Tagged ‘software’

The maven-site-plugin allows documentation websites to be created for Maven projects. Developers write their documentation in a variety of formats, such as Markdown, and the plugin translates it to HTML.

The maven-site-plugin also supports generating documentation from Velocity templates, which allows the actual contents of the documentation to be generated at build time. It is sometimes handy to use JSON data in such a template, so that your site can render data from JSON files in your project that might also be used for other things. This is supported by the maven-site-plugin, but getting it to work requires enabling the Velocity JsonTool module.

(more…)

I was recently listening to the Nanoservices? Miniservices? Macroservices? podcast from the NoFluffJustStuff team about different service styles, and I learned terminology that describes a style of structuring services I had recently been advocating. Miniservices are a more balanced way of breaking a system into services than microservices.

Microservices Recap

It is worth reviewing some key characteristics of a microservice.

  • A microservice is built around a bounded context. A bounded context is described in the language of the business and constitutes the boundary or ‘job’ of a service. Typical examples of bounded contexts might be “Billing”, “Maintaining Customer Information”, or “The Orders in the system”; in practice, however, the context boundaries tend to be much smaller and more refined to keep the services small. A bounded context is specifically not a technical boundary such as “Frontend” versus “Backend”. A microservice should be concerned with a single bounded context and have a single responsibility.
  • A microservice has its own data-store (i.e. database). The data-store belongs to the microservice and is not exposed outside of it. Multiple microservices do not share data-stores.
  • Microservices do not share code with other microservices. Two microservices can use the same library, but this library should be an independent project with its own release cycle and a commitment to stable public APIs. If two microservices need the same business rules then those business rules should belong to one bounded context (possibly a new microservice) and the other services should invoke those business rules through a stable service interface.
  • Each microservice can evolve and be deployed independently of other microservices. This allows each microservice team to move fast and make changes without having to coordinate with other teams. Changes to the public API are typically handled by versioning the API and supporting the old versions through a deprecation period.

The ability to deploy and evolve a microservice independently is one of the key goals of a microservice architecture. The desire for an organizational structure consisting of many small teams that can release often and deploy independently of other teams is what leads to many of the other characteristics on the list. Sharing code and databases across services and teams means that those services and teams now need to coordinate any changes to those shared pieces.

Microservice Drawbacks

There are some costs and downsides to this approach to software development resulting from the trade-offs made to optimize for small independent services.

  • Organizations that would previously have had one or two larger “applications” to handle a collection of business functionality might now have dozens of microservices to do the same thing. Each of these microservices needs to be maintained and updated for security fixes and deprecated APIs.
  • Code duplication becomes much more common because code sharing isn’t allowed. If some code is useful to two different services and that code isn’t appropriate to break out into its own service (or the schedule doesn’t allow for it), then teams will typically just copy the code into the other service’s source tree. Twenty years ago this was considered a big anti-pattern. Code duplication adds long-term maintenance costs.
  • The latency of an individual user-focused business operation can start to grow when the operation needs to contact a dozen other services to complete the workflow. There are techniques to mitigate this but they add complexity.
  • Coordinating integration tests of the entire architecture becomes very difficult because there are so many services and teams involved. Some organizations end up doing a lot of testing in production (i.e. A/B testing or running a parallel instance with mirrored production traffic).

Any individual microservice might be simple, but the overall architecture, with dozens to hundreds of microservices, tends to be very complex. The trade-offs made to allow small independent teams to deploy quickly aren’t the best fit for every organization.

Introducing Miniservices

Organizations that aren’t as focused on small independent teams with fast deployment cycles might instead be focused on efficiently evolving a software system as the business changes.

  • Business functionality is grouped into related collections, but the strictness of the “Single Responsibility Service” is no longer a driving principle. “Bounded Context” would probably be a good term for a “collection of related business functions”, but that means something different from how the term is typically used in a pure microservice architecture. Multiple services will live inside a group of related business functionality. I will call this a “Business Context”, though it is possible others have come up with a better term. A “Business Context” is usually larger than the “Bounded Context” that defines a single microservice.
  • The multiple related miniservices in a “Business Context” are allowed to share the same data-store (i.e. database). Not allowing two microservices to share a single database is one of the most expensive aspects of microservices: many common business reports become difficult to produce because they have to combine data from multiple services, often at the cost of increased complexity. Allowing multiple related miniservices to share a database can significantly reduce the implementation costs for a lot of architectures. There will often be multiple databases within a “Business Context”, and some databases will be more closely related to (or mostly used by) some services, but there is no hard rule preventing multiple miniservices from using the same database. These decisions should be made on an individual basis while keeping efficiency in mind.
  • The data-store shared between multiple miniservices in the “Business Context” should have a well-defined schema. This schema can change, but it needs to be well defined at any point in time and the data-store should enforce it. NoSQL systems that don’t enforce a formal schema can work when a single service is the only user of that data, but any data that will be used by multiple miniservices needs an enforced schema.
  • The multiple related miniservices are allowed to share code, typically through libraries (i.e. JARs). Developers should easily be able to move code into a library so that it can be shared with the related miniservices.
  • The teams responsible for the group of related miniservices should be close to each other on the organizational chart. Ideally the same team should maintain the collection of related miniservices. This allows for easier coordination of changes to shared database schemas and shared libraries. Different people or subteams can be dedicated to some of the miniservices, but it is important that they coordinate and work together when required.
  • Each miniservice within a “Business Context” can still be modified and deployed on its own schedule. The miniservices are still standalone deployable services. Sometimes the changes and deployment of two related miniservices will need to be coordinated and possibly combined. This is both common and acceptable in a miniservice architecture. We rely on the team to figure out these dependencies and coordinate them.
  • All of the related miniservices in a  “Business Context” must be tested together as part of a pre-deployment integration test.  This is needed because it is possible that a change might impact a service without anyone realizing it.
  • Multiple related miniservices can be deployed on a single virtual machine, or you can adopt an approach with Docker that gives each miniservice its own container. A miniservice architecture is flexible enough to accommodate both.
  • An individual “release” will consist of one or more miniservices from the set of related services being deployed together.

Miniservices and distantly related services

The above section describes how you can break a collection of related business functionality into miniservices and have them share a database. Some business functionality, and the related services, will end up being part of a different “Business Context”. These services (and their associated business rules) will usually be maintained by a different team, often living on a different part of the organizational chart. Sharing data-stores and code between these business contexts is discouraged. Two miniservices from different “Business Contexts” should probably not be writing data into the same data-store. Allowing a miniservice from one context to pull (or query) data from the data-store belonging to another context might sometimes be done for practical or efficiency reasons; this is discouraged but not outright forbidden.


The @Bean annotation is a great way to declare Spring beans when writing code. It allows the programmer to control which beans get instantiated and with what arguments. Some situations call for deciding at runtime, based on configuration, whether beans should be instantiated; this is often done with the @Conditional annotation on @Configuration classes. The @Conditional annotation works well if you are enabling or disabling a particular bean through configuration, but sometimes the number of beans needs to be completely dynamic.

This often comes up if your application needs to connect to a number of external services such as databases, and you want to make the number of services or databases the application connects to configurable.
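As a rough sketch of one way to register a completely dynamic set of beans (not necessarily the approach covered in the full post), a BeanDefinitionRegistryPostProcessor can register one bean definition per configured entry. The property name app.databases and the DataSource setup below are hypothetical placeholders.

import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.ConfigurableListableBeanFactory;
import org.springframework.beans.factory.support.BeanDefinitionBuilder;
import org.springframework.beans.factory.support.BeanDefinitionRegistry;
import org.springframework.beans.factory.support.BeanDefinitionRegistryPostProcessor;
import org.springframework.jdbc.datasource.DriverManagerDataSource;
import org.springframework.stereotype.Component;

@Component
public class DataSourceRegistrar implements BeanDefinitionRegistryPostProcessor {

    @Override
    public void postProcessBeanDefinitionRegistry(BeanDefinitionRegistry registry) throws BeansException {
        // The list of databases would normally come from the Environment or a
        // configuration file; "app.databases" is a hypothetical property name.
        String[] databases = System.getProperty("app.databases", "orders,billing").split(",");
        for (String name : databases) {
            BeanDefinitionBuilder builder = BeanDefinitionBuilder
                    .genericBeanDefinition(DriverManagerDataSource.class)
                    .addPropertyValue("url", "jdbc:postgresql://localhost/" + name); // placeholder URL
            // Register one DataSource bean per configured database name.
            registry.registerBeanDefinition(name + "DataSource", builder.getBeanDefinition());
        }
    }

    @Override
    public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory) throws BeansException {
        // Nothing to do here; all of the registration happens above.
    }
}

The registered beans can then be injected by name elsewhere in the application, or collected into a Map<String, DataSource>.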

(more…)

Spring Integration makes it easy to monitor an SFTP server for new files and inject those files into your application for processing. I like to configure my Spring Integration applications with annotations and @Bean configurations. I found the documentation and online examples for doing this with annotations a bit lacking.


This post demonstrates the basics of getting a spring-boot (1.3.x or 1.4.x) application to monitor an SFTP server.
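As a taste, here is a minimal sketch of the kind of annotation-based configuration involved, assuming spring-integration-sftp is on the classpath; the host, credentials, directories, channel name, and polling interval below are placeholders rather than values from the full post.

import java.io.File;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.InboundChannelAdapter;
import org.springframework.integration.annotation.Poller;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.config.EnableIntegration;
import org.springframework.integration.core.MessageSource;
import org.springframework.integration.sftp.inbound.SftpInboundFileSynchronizer;
import org.springframework.integration.sftp.inbound.SftpInboundFileSynchronizingMessageSource;
import org.springframework.integration.sftp.session.DefaultSftpSessionFactory;
import org.springframework.messaging.MessageHandler;

@Configuration
@EnableIntegration
public class SftpPollingConfig {

    @Bean
    public DefaultSftpSessionFactory sftpSessionFactory() {
        DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
        factory.setHost("sftp.example.com");   // placeholder host
        factory.setPort(22);
        factory.setUser("someuser");           // placeholder credentials
        factory.setPassword("somepassword");
        factory.setAllowUnknownKeys(true);     // relaxed host key checking for the example only
        return factory;
    }

    @Bean
    public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
        // Mirrors a remote directory into a local one so the application can process local files.
        SftpInboundFileSynchronizer synchronizer = new SftpInboundFileSynchronizer(sftpSessionFactory());
        synchronizer.setRemoteDirectory("/incoming");   // placeholder remote directory
        synchronizer.setDeleteRemoteFiles(false);
        return synchronizer;
    }

    @Bean
    @InboundChannelAdapter(channel = "sftpChannel", poller = @Poller(fixedDelay = "5000"))
    public MessageSource<File> sftpMessageSource() {
        SftpInboundFileSynchronizingMessageSource source =
                new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
        source.setLocalDirectory(new File("/tmp/sftp-inbound"));   // placeholder local directory
        source.setAutoCreateLocalDirectory(true);
        return source;
    }

    @Bean
    @ServiceActivator(inputChannel = "sftpChannel")
    public MessageHandler fileHandler() {
        // Each newly synchronized local file arrives here as the message payload.
        return message -> System.out.println("Received file: " + message.getPayload());
    }
}

The inbound adapter polls the remote directory, synchronizes new files into the local directory, and publishes each one as a message on sftpChannel where the handler picks it up.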

(more…)

Sometimes the behaviour of an application is controlled through properties, and the application needs to detect changes to the property file so it can switch to the new configuration. You also want to ensure that a particular request uses either the old configuration or the new configuration, but not a mixture of the two. Think of this as ACID-like isolation for properties, ensuring that your requests don’t get processed using an inconsistent configuration.

We accomplish this with a property lookup bean (PropertyBean) that performs property lookups. The PropertyBean is a request-scoped bean, meaning that a new instance is created for each request in a Spring MVC application. The PropertyBean has an init method that runs post-construction and calls the PropertyCache bean to get the current set of properties. Each instance of the PropertyBean will return property values from the Properties object it received at initialization.

[Diagram: properties]

The PropertyBean has a reference to a singleton-scoped PropertyCache. The PropertyCache maintains a reference to the current Properties object. When the PropertyCache bean is asked to return the current properties, it checks whether they have changed on disk and, if so, reloads them.

import java.util.Properties;
import javax.annotation.PostConstruct;
import org.springframework.beans.factory.annotation.Autowired;

class PropertyBean {
    private Properties currentProperties;

    @Autowired
    PropertyCache propertyCache;

    @PostConstruct
    void doInit() {
        // Grab the current Properties instance once, when this request-scoped bean is created.
        currentProperties = propertyCache.getProperties();
    }

    public String getProperty(String propertyName) {
        return currentProperties.getProperty(propertyName);
    }
}


The PropertyBean would look similar to the above sample class. This class simply caches an instance of a Properties object post-construction and delegates property requests to this instance.

The PropertyBean needs to be created as a request scoped bean. This can be done with a Spring annotation configuration class as follows.

@Configuration
class MyConfig {
    @Bean
    @Scope(value="request", proxyMode=ScopedProxyMode.TARGET_CLASS)
    public PropertyBean propertyBean() {
        return new PropertyBean();
    }
}


Managing the loading of properties falls to the PropertyCache class, which would look similar to the class below.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
public class PropertyCache {
    private Properties currentProperties;
    private long lastReloadTime = 0;

    @Value("${dynamic.property.file.path}")
    private String propertyFileName;

    public Properties getProperties() {
        reloadPropertiesIfRequired();
        synchronized (this) {
            return currentProperties;
        }
    }

    private void reloadPropertiesIfRequired() {
        FileInputStream stream = null;
        try {
            File f = new File(propertyFileName);
            if (f.lastModified() > lastReloadTime) {
                // The file has changed since the last load; read it into a fresh Properties instance.
                Properties newProperties = new Properties();
                stream = new FileInputStream(f);
                newProperties.load(stream);
                synchronized (this) {
                    lastReloadTime = System.currentTimeMillis();
                    currentProperties = newProperties;
                }
            }
        } catch (IOException exp) {
            // do something intelligent on the error
        } finally {
            if (stream != null) {
                try {
                    stream.close();
                } catch (IOException ignore) {
                    // nothing useful to do if the close fails
                }
            }
        }
    }
}


All of the business logic code that needs properties to process a request uses the request-scoped PropertyBean instance. This ensures that the entire request uses the same set of properties even if the underlying property file changes.

class SomeBusinessService {
    @Autowired
    private PropertyBean propertyBean;

    public void someOperation() {
        String v = propertyBean.getProperty("some.property");
        .
        .
    }
}

When a new request is started, a new instance of the PropertyBean is created. The doInit() method invokes the PropertyCache getProperties() method. The PropertyCache checks the timestamp of the file for modifications and, if it detects one, reloads the property file. Any PropertyBean instances that have already been initialized continue to use the old Properties instance.

PostgreSQL is becoming a more popular choice for an embedded database because of its BSD license, relatively low memory footprint and great list of features. A few people have asked me if Slony would be a good choice for replication in an embedded environment. Embedded deployments haven’t been a primary use-case for Slony and some of the challenges you would face are worth writing about.

(more…)

Last week I was in Chicago giving a talk at PostgresOpen on managing PostgreSQL with puppet. The talk was well attended and appears to have been well received.

Puppet is configuration management software that allows you to describe how your servers should look using a declarative syntax. You describe what packages you want to install (obviously postgres) and how your configuration files should look. Puppet also allows you to run commands to create databases or database objects such as users.

In my talk I discuss why it is important to use a repeatable procedure for building production database servers and how this is a tool in bridging the divide between developers and operations staff.

I talk about how deploying servers with automation allows your servers to be similar. Similar might not mean identical, but the differences between your database servers are controlled and managed. This also applies to your development and QA servers. If you deploy your staging, QA, and development servers using the same puppet manifest as your production servers, but possibly with different configuration options, then you will be more confident in your testing.

You can view my slides. They recorded the talk and I will update this post with a link to the talk when it is posted.

Updated: You can view a recording of the video below

This weekend we had the second annual Toronto OpenStreetMap developer weekend. The nice folks at the Ryerson Department of Geography hosted us. My focus this weekend was to work with Serge and Martijn on maproulette.

Maproulette is software that presents users with an easy-to-do mapping task which they can complete and then mark as done. Examples of past maproulette mapping challenges include fixing connectivity errors or fixing objects touched by the license change.
(more…)

I found the documentation on urllib2 a bit unclear about how to get cookie handling working properly.

I was working on a python script that needed to contact the OpenStreetMap web server, login with my OSM credentials and interact with the website.

The first step is to set up a urllib2 opener instance that is configured to store cookies.

import cookielib,Cookie,urllib2,urllib
import StringIO
import xml.etree.cElementTree as ElementTree

# a cookie jar plus an opener that stores any cookies set by HTTP responses
cookies = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookies))

This will create an opener that can be used to retrieve URLs. Any cookies set in an HTTP response will be stored in the cookie jar. If I needed additional handlers (i.e. for special redirect handling) I would just add them as additional parameters to the build_opener call, i.e. urllib2.build_opener(handler1, handler2, handler3)…

Next we need to contact OpenStreetMap to get a blank login form. The blank login form has a hidden field ‘authenticity_token’ that needs to be passed back as part of the POST with my login credentials.

inptag = '{http://www.w3.org/1999/xhtml}input'
formtag = '{http://www.w3.org/1999/xhtml}form'
# fetch the blank login form (the login page URL is assumed here to be the dev server's /login)
request = urllib2.Request('http://api06.dev.openstreetmap.org/login')
response_tokenfetch = opener.open(request)
html = response_tokenfetch.read()
htmlfile = StringIO.StringIO(html)
# parse the HTML elements in the form
# extract any input fields for later resubmission
# this will pick up the authenticity_token and anything else
login_payload = {}
xml_tree = ElementTree.parse(htmlfile)
for form in xml_tree.getiterator(formtag):
    for field in form.getiterator(inptag):
        if 'name' in field.attrib and 'value' in field.attrib:
            login_payload[field.attrib['name']] = field.attrib['value']
login_payload['username'] = username
login_payload['password'] = password
login_payload['remember_me'] = 'yes'
login_payload['cookie_test'] = 'true'

Next we submit the login request as a POST. Any session cookies returned as part of the blank form will be added to this second request.

cookies.add_cookie_header(request)
response = opener.open(request,urllib.urlencode(login_payload))

If our login was successful then cookies contains an _osm_session and _osm_username that will be used in subsequent API calls.

request2 = urllib2.Request('http://api06.dev.openstreetmap.org/user/stevens/inbox')
cookies.add_cookie_header(request2)
response2 = opener.open(request2)
html = response2.read()

You could then parse the HTML to extract a list of messages.
If you’re using the formal OpenStreetMap API (i.e. calls under /api/0.6/…) then you should use OAuth for authentication instead of logging in through the website. Some OSM features such as messaging can only be accessed by pretending to be a web session and parsing/faking HTML.

I spent the weekend attending Pycon Canada where I gave a talk on Pl/Python. I want to thank the conference organizers for putting on an excellent conference. I am told that this was the first time Pycon had a regional conference in Canada and that it was put together by a group of volunteers in less than 6 months.

One of my favourite parts of local/regional conferences held on weekends is that they tend to attract attendees who are passionate about computers and technology. The people I spoke with at the conference were there because they wanted to be there, not because their boss wanted them to be there, and they either loved Python or wanted to learn more about it. I’ve attended many great PostgreSQL conferences over the past few years, but it was nice to spend some time talking with people from broader development backgrounds.

In my discussions with people at the conference I noticed a trend. People I spoke with who worked at companies doing Python development tended to be using PostgreSQL. The ones that weren’t currently using PostgreSQL were using MySQL and were either talking about moving to PostgreSQL or apologetic for still being on MySQL. The MySQL users were often apologizing before I told them that I was a PostgreSQL contributor. Some of the MySQL users also mentioned that they were using non-Oracle forks like Percona.

This was in contrast to the people at the conference who described their workplaces as doing primarily Java development. The Java development shops tended to be using Oracle or SQL Server. I admit that my sample size of Java developers wasn’t that big (this was a Python conference after all), but my observations are worth keeping in mind since they might indicate a pattern. Other people have commented on the popularity of PostgreSQL in the Ruby community.

I wonder how much of this observation is because older applications written in Java are already using SQL Server or Oracle and there hasn’t been a strong enough driver to change to PostgreSQL, while newer software projects tend to choose Python or Ruby over Java and at the same time pick a FLOSS database such as PostgreSQL, where they don’t have to worry about migrating a legacy application.

My talk on writing stored functions in Pl/Python was well received. A lot of people saw the appeal of being able to write their stored functions in Python instead of PL/pgSQL, but that shouldn’t be a surprise considering this was a Python conference.

My slides are available here, and the video of the talk is posted at pyvideo.