
Another Microsoft Certification acquired!


Ever since gaining my MCTS (Microsoft Certified Technology Specialist) and MCPD (Microsoft Certified Professional Developer) certificates at the end of 2011 and the early part of 2012, I’ve had an appetite to acquire more.  Life seemed to get in the way of this during 2012, so that was, unfortunately, a quiet year on the certification front.

 

Well, it’s now 2013, and Microsoft have recently revamped a lot of their certification offerings.  A new type of certification that they’ve introduced is the Microsoft Specialist.  The Microsoft Specialist certification seems to be a replacement for the old MCTS (Microsoft Certified Technology Specialist) and is effectively a certificate awarded for showing competence in a specific piece of Microsoft technology, of which there are quite a number.

 

During the latter part of 2012 and the early part of 2013, Microsoft were running a promotion to take a free exam.  This was exam 70-480 – Programming in HTML5, JavaScript & CSS3.  Successfully passing this exam would award the exam taker the certification of Microsoft Specialist – Programming in HTML5, JavaScript & CSS3.

 

Well, towards the end of February of this year, I sat and successfully passed the exam, acquiring the certification of Microsoft Specialist – Programming in HTML5, JavaScript & CSS3.

 

This is one of three exams that, when all three are successfully passed, will earn the new-style Microsoft Certified Solutions Developer – Web Applications certificate.  I guess the rest of this year’s certification journey has just been mapped out!


DDD East Anglia Conference Write-Up


This past Saturday, 29th June, saw the inaugural DDD East Anglia conference.  This is the latest addition to the DeveloperDeveloperDeveloper events that take place all over the UK and sometimes around the world!  I was there, this being only the second DDD event I’d ever attended.

 

DDD East Anglia was set on the grounds of the extensive Cambridge University in a building called “The Hauser Forum”.  For those attendees like myself who were unfamiliar with the area or the university campus, it was a little tricky to find exactly where to go.  The DDDEA team did make a map available on the website, but the designated car park that we were to use was cordoned off when I arrived!  After some driving around the campus, and with the help of a fellow attendee who was in the same situation as myself, I managed to find another car park that could be used.  I must admit that better signposting for both the car parking and the actual Hauser Forum building would have helped tremendously here.  As you can imagine, Cambridge University is not a small place and it’s fairly easy to get lost amongst the myriad of buildings on campus.

 

The event itself had 3 parallel tracks of sessions, with 5 sessions throughout the day in each track.  As is often the case with most DDD events, once the agenda was announced, I found myself in the difficult position of having to choose one particular session over another, as there were a number of timeslots where multiple sessions that I’d really like to attend were running concurrently.  As annoying as this can sometimes be, it’s testament to the quality and diversity of the sessions available at the various DDD events.  East Anglia’s DDD was no different.

 

As I’d arrived slightly late (due to the car parking shenanigans) I quickly signed in and went along to Seminar Room 1 where the introduction was taking place.  After a brief round of introductions, we were off and running with the first session.  There had been some last-minute changes to the published agenda so, without quite knowing where I wanted to be, I didn’t have to move from Seminar Room 1 and found myself in Dave Sussman’s session entitled, “SignalR: Ready For Real-Time”.

 

Dave started out by talking about how SignalR is built upon the notions of persistent connections and hubs on the server side, with hubs being built on top of persistent connections and simply offering a higher level of abstraction.  SignalR, at its most basic, is a server-side hub (or persistent connection) through which all messages and data flow, and this hub then broadcasts that data back to each connected client.  Other than this, and for the client side of the equation, Dave tells us that SignalR is effectively one big jQuery module!
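
As a rough illustration of what Dave described (my own sketch rather than Dave’s code, using the Microsoft.AspNet.SignalR package of the era; the hub and method names are purely illustrative), a hub really is little more than a class whose methods clients can invoke and which broadcasts back out to every connected client:

using Microsoft.AspNet.SignalR;

public class ChatHub : Hub
{
    // Clients call this method on the server...
    public void Send(string name, string message)
    {
        // ...and the hub pushes the result back out to every connected client,
        // invoking the client-side "addMessage" callback.
        Clients.All.addMessage(name, message);
    }
}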

 

One of the complexities that SignalR wraps up and abstracts away from the developer is the requirement to determine the best communication protocol to use in a given situation.  SignalR uses WebSockets as the default method of communication, if the client supports the protocol.  WebSockets is a relatively new protocol that provides true duplex communication between client and server.  This facilitates cool functionality such as server-side push to the client; however, if WebSockets are not available, SignalR will seamlessly “downgrade” the connection protocol to HTTP long-polling – which uses a standard HTTP connection that is kept alive for a long time in order to receive the response from the server.

 

Dave starts to show us a demo, which fails to work the first time.  Dave had planned this, however, and proceeded to tell us about one of the most common problems in getting a simple SignalR demo application up and running: adding the call to .MapHubs (a requirement to register the routes of the Hubs that have been defined on the server side) after all of the other route registration has been done.  This causes SignalR to fail to generate some dynamic JavaScript code that is required by the client.  The resolution is simply to place the call to .MapHubs before any calls to the other MVC route registrations.
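
To make the fix concrete, here’s a rough sketch (my own, assuming the SignalR 1.x API that was current at the time and the standard MVC 4 project template’s RouteConfig) of the ordering Dave described in Global.asax:

using System.Web.Mvc;
using System.Web.Routing;
using Microsoft.AspNet.SignalR;   // provides the MapHubs() route extension in SignalR 1.x

public class MvcApplication : System.Web.HttpApplication
{
    protected void Application_Start()
    {
        // Register the SignalR hub routes BEFORE the general MVC routes, otherwise
        // the dynamically generated /signalr/hubs JavaScript proxy is never served.
        RouteTable.Routes.MapHubs();

        AreaRegistration.RegisterAllAreas();
        RouteConfig.RegisterRoutes(RouteTable.Routes);
    }
}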

Dave tells us that SignalR doesn’t have to be in the browser.  We can create many other types of application (console apps, WinForms apps etc.) that use the HubConnection object to connect over HTTP, and then create a proxy in the client application that can send and receive the required messages over HTTP to the SignalR hub on the server!  Also, although it’s seen as an ASP.NET addition, SignalR isn’t dependent upon the ASP.NET runtime.  You can use it, via JavaScript, in a simple standalone HTML page with just the inclusion of a few JavaScript files and a bit of your own JavaScript to create and interact with a jQuery $.hubConnection(), which allows sending and receiving messages and data to and from the SignalR server-side Hub.
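
A minimal console client along the lines Dave described might look something like the sketch below (my own code using the Microsoft.AspNet.SignalR.Client package; the URL and hub/method names are illustrative):

using System;
using Microsoft.AspNet.SignalR.Client;

class Program
{
    static void Main()
    {
        // Connect to the SignalR server and create a proxy for the "ChatHub" hub.
        var connection = new HubConnection("http://localhost:8080/");
        var chat = connection.CreateHubProxy("ChatHub");

        // Subscribe to messages pushed from the server-side hub.
        chat.On<string, string>("addMessage", (name, message) =>
            Console.WriteLine("{0}: {1}", name, message));

        connection.Start().Wait();

        // Invoke a method on the server-side hub.
        chat.Invoke("Send", "console-client", "Hello from a console app!").Wait();

        Console.ReadLine();
    }
}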

 

Although SignalR’s most used and default function on the Hub is the ability to broadcast a received message back to all connected clients (as in the ubiquitous “chat” sample application), SignalR has the ability to have the server send messages to only one or more specific clients.  This is done by sending a given message to a specific client based upon that client’s unique connection ID, which the Hub keeps in an internal static list of connected clients.  Clients can also be “grouped” if needed, so that multiple clients (but not all connected clients) can receive a certain message.  There’s no inherent state in the server-side Hub; therefore, things like the server-side collection of connected clients are declared as static to ensure state is maintained.  The Hub class itself is re-instantiated with each message that needs processing.  Methods of the Hub class can return Tasks, so clients can send a message to the server hub and be notified sometime later when the task is completed (for example, if performing some long-running operation such as a database backup).
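
A hub that targets individual connections and groups in the way described above might be sketched like this (again my own illustrative code, not Dave’s):

using System.Threading.Tasks;
using Microsoft.AspNet.SignalR;

public class NotificationHub : Hub
{
    // Broadcast to everyone, reply only to the caller, target one specific
    // connection, or target a named group of connections.
    public void Notify(string targetConnectionId, string message)
    {
        Clients.All.notify(message);
        Clients.Caller.notify("You sent: " + message);
        Clients.Client(targetConnectionId).notify(message);
        Clients.Group("admins").notify(message);
    }

    // Group membership is managed against the caller's connection id.
    public Task JoinAdmins()
    {
        return Groups.Add(Context.ConnectionId, "admins");
    }
}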

 

We’re told that there’s no built-in persistence with SignalR, and we should be aware that messages can (and sometimes do!) get lost in transit.  SignalR can, however, be configured to run over a message bus (for example, Windows Azure Service Bus) and this can provide persistence and an improved guarantee of delivery of messages.

 

Finally, although the “classic” demo application for SignalR is that of a simple “chat” application with simple text being passed back and forth between client and server, SignalR is not restricted to sending text.  You can send any object!  The only requirement here is that the object be able to be serialized for communication over the wire.  The objects that are passed across the wire are serialized, by default, with JSON (internally using Newtonsoft’s Json.NET library).

 

After a quick break, during which there was tea, coffee and some lovely Danish pastries available, I was back in the same Seminar Room 1 to attend Mark Rendle’s session entitled, “The densest, fastest-moving talk ever”.  Mark’s session wasn’t really about any single topic, but consisted of Mark writing and developing a simple browser-based TODO List application that utilised quite a number of interesting technologies.  These were Mark’s own Simple.Web and Simple.Data, along with some TypeScript, AngularJS and Bootstrap!

 

As a result of the very nature of this session, which contained no slides, prepared demos or monologue regarding a specific technology, it was incredibly difficult to take notes on this particular talk.  It was simply Mark, doing his developer thing on a big screen.  All code.  Of course, throughout the session, Mark would talk about what he was doing and give some background as to the what and the why.  As I’m personally unfamiliar with TypeScript and AngularJS, it was at times difficult to follow why Mark was making the choices he did when utilising one or more of these technologies.  Mark’s usage of his own Simple.Web and Simple.Data frameworks was easier to understand, and although I’ve not used either of these frameworks before, they both looked incredibly useful and lightweight, allowing you to get basic database reading & writing within a simple web-based application up and running quite quickly.

 

After 30 minutes of intense coding, including what appeared to be an immense amount of set-up and configuration of AngularJS routing, Mark can show us his application, which is displaying his TODO items (from his previously prepared SQL Server database) in a lovely Bootstrap-styled webpage.  We’re only reading data at the moment, with no persistence back to the DB, but Mark spends the next 30 minutes plugging that in and putting it all together (with even more insane AngularJS configuration!).  By the end of the session, we do indeed have a rudimentary TODO List application!

 

I must admit that I feel I would have got a lot more from this session if I already knew more about the frameworks that Mark was using, specifically AngularJS which appears to be a rather extensive framework that can do everything you’d want to do in client-side JavaScript/HTML when building a web application.  Nonetheless, it was fun and enjoyable to watch Mark pounding out code.  Also, Mark’s inimitable and very humorous style of delivery made this session a whirlwind of information but really fun to attend.

 

Another break followed after Mark’s session with more tea, coffee and a smorgasbord of chocolate-based snacks positioned conveniently on tables just outside of each seminar room (more on the food later!).  Once the break was over, it was time for the final session of the morning and before the lunch break.  This one was Rob Ashton’s “Outside-In Testing of MVC”.

 

Rob’s opening gambit in this session is to tell us that his talk isn’t really about testing MVC, but about testing web applications in general, and will include some MVC within it.  A slight bait and switch, and clearly Rob’s style of humour.  He’s mostly a Ruby developer these days, so there’s little wonder there’s only a small amount of MVC within the session!  That said, the general tone of the talk is to explore ways of testing web applications from the outermost layer – the user interface – and how to achieve that in a way that’s fast, scalable and non-brittle.  To that end, it doesn’t really matter what language the web application under test is written in!

 

Rob talks about TDD and how people trying to get started with TDD often get it wrong.  This is very similar to what Ian Cooper talked about in his “TDD, where did it all go wrong?” talk that he’s given recently in a number of places; I attended Ian’s talk at a recent local user group.  Rob says he doesn’t focus so much on “traditional” TDD, and that having complete tests that start at the UI layer and test a discrete piece of functionality in a complete end-to-end way is very often the best of all testing worlds.  Of course, the real key to being able to do this is to keep those tests fast.

 

Rob says he’s specifically avoiding definitions in his talk.  He just wants to talk about what he does and how it really helps him when he does these very things.  To demonstrate, he starts with the example of starting a brand new project.  He tells us that if we’re working in brownfield application development, we may as well give up all hope!  :(

 

Rob says that we start with a test.  This is a BDD style test, and follows the standard “Given, When, Then” format.  Rob uses CoffeeScript to write his tests as its strict handling of white-space forces him to keep his tests short, to the point, and easily readable, but we can use any language we like for the tests, even C#.

 

Rob says he’s fairly dismissive of the current set of tools often used for running BDD tests, such as Cucumber.  He says they can add a lot of unnecessary noise to the test script and often cause the test script wording to become more detached and abstracted from what the test should actually be doing in relation to the application itself.  So we are asked the question, “What do we need to run our test?” – merely a web browser and a web server!

 

In order to keep the tests fast, we must use a “headless” browser.  These are implementations of browser functionality but without the actual UI and chrome of a real web browser.  One such headless browser is PhantomJS.  Using such a tool allows us to run a test that hits our webpage, performs some behaviour – adds text to a textbox, clicks a button etc. – and verifies the result of those actions, all from the command line.  Rob is quick to suggest that we shouldn’t use PhantomJS directly, as then our tests will be tightly coupled to the specific browser we’re running them within.  Rob suggests using WebDriver (part of the Selenium suite of web browser automation tools) in conjunction with PhantomJS, as that provides a level of abstraction and thereby avoids coupling the tests tightly to the browser (or headless browser) being used.  This level of abstraction is what allows the actual test scripts themselves to be written in any language of our choosing.  It just needs to be a language that can communicate with the WebDriver API.
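
As a rough idea of what this looks like in C# (my own sketch using the Selenium WebDriver bindings and the PhantomJS driver of the time; the URL and element IDs are made up):

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;

public class LoginSmokeTest
{
    public void UserCanLogIn()
    {
        // Drive a headless PhantomJS browser through the WebDriver abstraction,
        // so the test isn't tied to any one (headless) browser implementation.
        using (IWebDriver driver = new PhantomJSDriver())
        {
            driver.Navigate().GoToUrl("http://localhost:5000/login");
            driver.FindElement(By.Id("username")).SendKeys("craig");
            driver.FindElement(By.Id("password")).SendKeys("secret");
            driver.FindElement(By.Id("login-button")).Click();

            // Assert on the observable outcome, not on implementation details.
            if (!driver.PageSource.Contains("Welcome"))
                throw new Exception("Login page did not show the welcome message");
        }
    }
}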

 

Rob then proceeds to show us a demo of running multiple UI tests in a console window.  These tests are loading a real webpage, interacting with that page – often involving submission of data to the server to be persisted in some way – and asserting that some action has happened as a result of that interaction.  They’re testing the complete end-to-end process.  The first thing to note is that these tests are fast, very fast!  Rob is spitting out some simple diagnostic timings with each test, and each test is completing in approx. 500ms!

 

Rob goes on to suggest ways of ensuring that, when we write our tests, they’re not brittle and too closely tied to the specific IDs or layout of the elements within the page that we’re testing.  He mentions one of the best tools to come from the Ruby world, Capybara.  Rob says that there’s a .NET version of Capybara called Coypu, although it’s not quite as feature-complete as Capybara.  Both of these tools aim to allow intelligent automation of the browser testing process and help make tests readable, robust, fast to write, with less duplication and less tightly coupled to the UI.  They help to prevent brittle tests that are heavily bound to UI elements.  For example, when instructed to fill in a “username” textbox, the tools try multiple ways to find it: first looking for the specific ID, then intelligently looking for a <label for="username"> if the ID is not found and using the textbox associated with that label.  If that’s not found, the tool will then intelligently try to find a textbox that happens to be “near” to where some static text saying “Username” may be on the page.
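
If I’ve remembered Coypu’s API correctly, a test in the same spirit looks roughly like the sketch below (the host, port, field labels and page text are all illustrative):

using Coypu;

public class SignUpTest
{
    public void NewUserCanRegister()
    {
        // AppHost/Port point at the site under test; default driver settings are used.
        var config = new SessionConfiguration { AppHost = "localhost", Port = 5000 };

        using (var browser = new BrowserSession(config))
        {
            browser.Visit("/signup");

            // Coypu tries several strategies (id, label, nearby text) to find each field,
            // so the test survives cosmetic changes to the markup.
            browser.FillIn("Username").With("craig");
            browser.FillIn("Password").With("secret");
            browser.ClickButton("Register");

            if (!browser.HasContent("Thanks for registering"))
                throw new System.Exception("Registration confirmation was not shown");
        }
    }
}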

 

Rob suggests not to bother with “fast” unit tests; instead, make your UI tests faster!  You’ll run them more frequently, and a fast UI test means a fast UI.  If we achieve this, we’re not just testing the functionality: by ensuring we have a suite of UI tests that run very fast, we will, by virtue of that, have an actual application and UI that runs very fast.  This is a win-win situation!

 

Rob proceeds to build up a demo application that we can use to run some tests against.  He does this to show us that he’s not going to concern himself with databases or persistence at this point – he’s only storing things in an in-memory collection.  Decisions about persistence and storage should come at the very end of the development as, by then, we’ll have a lot more information about what that persistence layer needs to be (i.e. a document database, or SQL Server for more complex queries, etc.)  This helps to keep the UI tests fast!

 

Rob then proceeds to give us a few tips on MVC-specific development, and also about how to compose our unit tests when we have to step down to that level.  He says that our controllers should be very lightweight and that we shouldn’t bother testing them – you’ve got UI tests that cover that anyway.  He states that, “If your controller has more than one IF statement, then it shouldn’t!”.  Controllers should be performing the minimal amount of work.  Rob says that if a certain part of the UI or page (say a register/signup form) has complex validation logic, we should test that validation in isolation in its own test(s).  Rob says that ActionFilters are bad.  They’re hard to test properly (usually needing horrible mocking of HttpContext etc.) and they often hide complexity and business logic.  This logic is better placed in the model.  We should also endeavour to have our unit-level tests not touch any part of the MVC framework.  If we do need to do that, have a helper method that abstracts that away and allows the test code to not directly touch MVC at all.
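
A sketch of the kind of “thin” controller Rob was advocating (my own hypothetical example, not Rob’s code): the controller just routes the request, and the interesting logic lives behind an interface where it can be unit tested without touching MVC at all.

using System.Web.Mvc;

public class RegisterModel
{
    public string Username { get; set; }
    public string Password { get; set; }
}

public interface IRegistrationService
{
    void Register(RegisterModel model);
}

public class RegistrationController : Controller
{
    private readonly IRegistrationService _registrations;

    public RegistrationController(IRegistrationService registrations)
    {
        _registrations = registrations;
    }

    [HttpPost]
    public ActionResult Register(RegisterModel model)
    {
        // The single IF: the validation rules themselves live on the model/service
        // and are tested in isolation, not via the controller.
        if (!ModelState.IsValid)
            return View(model);

        _registrations.Register(model);
        return RedirectToAction("Welcome");
    }
}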

 

To close, Rob gives us the “key takeaways” from his talk:  Follow your nose, focus on the pain and keep the feedback loop fast.  Slides for Rob’s talk are available here.

 

After Rob’s talk, it was time for lunch.  This was provided by the DDDEA team, and consisted of sandwiches, crisps, a drink and even more chocolate-based confectionery.  There was also the token gesture of a piece of fruit, I suppose to give the impression that there were some healthy items in there!

 

There was even the ability to sit outside of the main room in the Hauser Forum and eat lunch in an al-fresco style.  It was a beautiful day, so many attendees did just that.  The view from the tables on this balcony was lovely.

 

As is often the case at events such as these, there were a number of informal “grok” talks that took place over the lunchtime period.  These are usually 10-minute talks from any member of the audience who cares to get up and talk about a subject that interests them or that they’re passionate about.

 

Since I was so busy stuffing my face with the lovely lunch that was so kindly provided, I had missed the first of the grok talks.  I managed to miss most of the second grok talk, too, which was given by Dan Maharry about his experiences writing technical books.  As I only caught the very end of Dan’s talk, I saw only one slide, upon which were the wise words, "Copy editors are good. Ghost-writers are bad."  Dan did conclude that whilst writing a technical manual can be very challenging at times, it is worth it when, three months after completing your book, you receive a large package from the publishers with 20-30 copies of your book in there with your own name in print!

 

The last grok talk, which I did catch, was given by Richard Dutton on life as a Software Team Lead at Red Bull Racing.  Richard spoke about the team itself and what kind of software they produce to help the Formula 1 team build a better and faster car.  Richard answered the question of “What’s it like to work in F1?”.  He said there are long hours and high pressure, but it’s great to travel and see how the software you write affects the cars and the race first hand.

 

The Red Bull Racing development team is about 40 people strong.  About half of these have a MATLAB background rather than .NET/C#.  Richard’s main role is developing software for data distribution and analysis.  He writes software that is used on the race pit walls as well as back at HQ.  They can get a data reading from the car back to the IT systems at HQ within 4 seconds from anywhere in the world!  The main data they capture is GPS data, telemetry data and timing data, and within each of these categories there can be thousands of individual data points captured.

 

Richard spoke about the team’s development methodology and said that they do “sort-of” agile, but not true agile.  It’s short sprints that align with the F1 race calendar.  There are debriefs after each race.  When 2 races are back-to-back on consecutive weekends, there’s only around 6 hours of development time between these two races!

 

The main language used is C# on .NET 4.5 (with VS2010 & VS2012, TFS and a centralised build system) and they mostly develop WPF applications (with some legacy WinForms stuff in there as well).  There’s also a lot of MATLAB.  They still have to support Windows XP as an OS, as well as more modern platforms like Windows Phone 8.

 

After the lunchtime and grok talk sessions were over, it was back to the scheduled agenda.  There were two more sessions left for the day, and my first talk of the afternoon was Ashic Mahtab’s “Why use DDD, CQRS and Event Sourcing?”

 

Ashic starts by ensuring that everyone is familiar with the terminology of DDD, CQRS and Event Sourcing.  He gives us the 10-second elevator pitch of what each acronym/technology is, to ensure we know.  He says that he’s not going to go into detail about what these things are, but rather why and when you should use them.  For the record, DDD is Domain-Driven Design; CQRS is Command Query Responsibility Segregation, and is about having two different models for reading vs. writing of data; and Event Sourcing is about not writing a lot of changes to a database record in one go, but writing the different stages of changes to the record over time.  The current state of the record is then derived by combining all the changes (like delta differences).
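
To make the event sourcing idea a little more concrete, here’s a hypothetical sketch (my own, not Ashic’s) of a bank account whose balance is never stored directly but is instead derived by replaying an immutable stream of change events:

using System;
using System.Collections.Generic;

public abstract class AccountEvent { }

public class MoneyDeposited : AccountEvent
{
    public decimal Amount { get; private set; }
    public MoneyDeposited(decimal amount) { Amount = amount; }
}

public class MoneyWithdrawn : AccountEvent
{
    public decimal Amount { get; private set; }
    public MoneyWithdrawn(decimal amount) { Amount = amount; }
}

public class Account
{
    private readonly List<AccountEvent> _events = new List<AccountEvent>();
    public decimal Balance { get; private set; }

    public void Deposit(decimal amount)  { Apply(new MoneyDeposited(amount)); }
    public void Withdraw(decimal amount) { Apply(new MoneyWithdrawn(amount)); }

    private void Apply(AccountEvent e)
    {
        // Each change is recorded as an immutable event (an audit log for free),
        // and the in-memory projection of current state is updated from it.
        _events.Add(e);
        if (e is MoneyDeposited) Balance += ((MoneyDeposited)e).Amount;
        if (e is MoneyWithdrawn) Balance -= ((MoneyWithdrawn)e).Amount;
    }

    // The current state can always be rebuilt by replaying the full event stream.
    public static Account Replay(IEnumerable<AccountEvent> history)
    {
        var account = new Account();
        foreach (var e in history) account.Apply(e);
        return account;
    }
}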

 

He says that very often applications and systems are designed as “one big model”.  This usually doesn’t work out so well in the end.  Ashic talks about the traditional layered top-down N-Tier architecture and suggests that this is a bad model to follow these days.  Going through many layers makes no sense, and this is demonstrated especially well when looking at reading vs. writing data – something directly addressed by CQRS.  Having your code go through a layer that (for example) enforces referential integrity when only needing to read data rather than writing it is unnecessary as referential integrity can never be violated when reading data, only when data is being written.

 

Ashic continues his talk by discussing the notion of a ubiquitous language and that, very often, a ubiquitous language isn’t ubiquitous.  Different people call things by different names.  This is often manifested between disparate areas of an enterprise: the business analysts may call something by one name, whilst the IT staff may call the same thing by a different name.  We need to use a ubiquitous language, but we also need to understand that it’s often only ubiquitous within a “bounded context”.  A bounded context is a ring-fenced area where a single “language” can be used by everyone within that area of the enterprise and is ubiquitous within that context.  This provides a “delimited applicability of a particular model” and “gives team members a clear and shared understanding of what has to be consistent and what can develop independently”.

 

Ashic goes on to talk about the choices of a specific technology stack and how those choices can impact many areas of a project.  A single technology stack, such as the common WISA Microsoft-based stack for web applications (the Windows/Microsoft equivalent of the even more common LAMP stack), can often reduce IT expenditure within an enterprise, but those cost savings can be offset by the complexity of developing part of a complete system using a technology that’s not an ideal fit for the purpose.  An example may be using SQL Server to store documents or binary data, when a document-oriented database would be a much more appropriate solution.

 

Ashic tells a tale of his current client, who have only one big model for their solution, comprising around 238 individual projects in a single Visual Studio solution.  A simple feature change of only 3 lines of code required the entire solution to be redeployed.  This in turn required testing/QA, compliance verification and other related disciplines to be re-performed across the entire solution, even though only a tiny portion had actually changed.  The “one big model” had forced them into this situation, whereas multiple, separate models communicating with each other by passing messages in a service-oriented approach would have facilitated a much smaller footprint for deployment, and thus a smaller sized application that needed testing and verification.

 

Ashic tells us that event sourcing over a RESTful API is a good thing.  Although there’s the possibility of the client application dealing with slightly stale data, it’ll never be “wrong” data, as the messages passed over this architecture are immutable.  Also, if you’re using event sourcing, there’s no need to concern yourself with auditing and logging; all individual changes are effectively audited anyway by virtue of the event sourcing mechanism!  Ashic advises caution when applying event sourcing, and consideration should be given to where not to apply it.  If all you’re doing in a certain piece of functionality is retrieving a simple list of countries from a database table that perhaps contains only one or two columns, it’s overkill and will only cause you further headaches if applied.

 

He states that versioning of data is difficult in a relational database model.  You can achieve rudimentary versioning with a version column on your database tables, or a separate audit table; however, this is rarely the best approach, nor does it deliver the best performance or design.  Event sourcing can significantly help in this regard, too, as rather than versioning an entire record (say a “customer” record which may consist of dozens of fields), you’re versioning a very small and specific amount of data (perhaps only a single field of the customer record).  The event sourcing message that communicates the change in the one field (or a small number of fields) effectively becomes the version itself, as multiple changes to that field (or fields) will be sent over several different immutable messages.

 

The talk continues with an examination of many of the tools and technologies that we use today: dependency injection, object mapping (with AutoMapper, for example) and aspect-oriented programming.  Ashic ponders whether these things are really good practice.  Not so much the techniques that (for example) a dependency injection container performs, but whether we need the container itself at all.  He says that before DI containers came along, we simply used the factory pattern and wrote our own classes to perform the very same functionality.  Perhaps where such techniques can be written by ourselves, we should avoid leaning upon third-party libraries.  After all, dependency injection can often be accomplished in as little as 15 lines of code!  For something a little more complicated, such as aspect-oriented programming, Ashic uses the decorator pattern instead.  It’s tried and trusted, and doesn’t re-write the IL of your compiled binaries – something which makes debugging very difficult – like many AOP frameworks do.
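
As a sketch of the decorator-plus-hand-rolled-factory approach Ashic described (my own illustrative code), a cross-cutting concern like logging can be layered on without any IL rewriting, and the “container” is just a few lines of factory code:

using System;

public interface IOrderService
{
    void PlaceOrder(string productCode, int quantity);
}

public class OrderService : IOrderService
{
    public void PlaceOrder(string productCode, int quantity)
    {
        // ...the real business logic lives here...
    }
}

// A plain decorator adds the cross-cutting concern without touching the
// original class or rewriting IL, so debugging stays straightforward.
public class LoggingOrderService : IOrderService
{
    private readonly IOrderService _inner;

    public LoggingOrderService(IOrderService inner) { _inner = inner; }

    public void PlaceOrder(string productCode, int quantity)
    {
        Console.WriteLine("Placing order: {0} x {1}", quantity, productCode);
        _inner.PlaceOrder(productCode, quantity);
        Console.WriteLine("Order placed");
    }
}

// "Dependency injection" in a handful of lines: a hand-rolled factory does the wiring.
public static class ServiceFactory
{
    public static IOrderService CreateOrderService()
    {
        return new LoggingOrderService(new OrderService());
    }
}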

 

Ashic concludes his talk by restating the general theme.  Don’t use “one big model” to design your system.  Create bounded contexts, each with their own model, and use a service-oriented architecture to pass messages between these model “islands”.  The major drawback to this approach is that there’s a fair amount of modelling work to do upfront to ensure that you properly map the domain and can correctly break it down into multiple discrete models that make sense.

 

After Ashic’s talk, there was the first afternoon break.  Each of the three seminar rooms shuffled out into the main hall area to be greeted with a lovely surprise.  There was a table full of delicious local cheeses, pork pies, grapes and artisan bread, lovingly laid on by Rachel Hawley and the generous folks at Gibraltar Software (thanks guys – I think you can safely say that the spread went down very well with the attendees!)

 

So after we had all graciously stuffed our faces with the marvellous ploughman’s platter, and wet our whistles with more tea and coffee, it was time for the final session of the day.

 

The final session for me was Tomas Petricek’s “F# Domain-Specific Languages”.

 

Tomas starts out by mentioning that F# is a growing language with a growing community across the world - user groups, open source projects etc.  It’s also increasingly being used in a wide variety of companies across many different areas. Credit Suisse use F# for financial processing (perhaps the most obvious market for F#) but the language is also used by companies like Kaggle for machine learning and also companies like GameSys for developing the server-side components used within such software as Facebook games.

 

Tomas then demos a sample 3D domain-specific language (or DSL) that combines multiple 3D cylinders, cones and blocks to compose ever more elaborate structures from the component parts.  He shows building a 3D “castle” structure using these parts, combining multiple functions from the domain-specific language, underpinned by F# functions.  He shows that the syntax of the DSL contains very little F# code, only requiring a small number of F# constructs when we come to combine the functions to create larger, more complex functions.

 

After this demo, Tomas moves on to show us how a DSL for European Call and Put stock options may look.  He explains what call and put options are (Call options are an agreement to buy something at a specific price in the future and Put options are an agreement to sell something at a specific price in the future) and he then shows some F# that wraps functions that model these two options.

 

Whilst writing this code, Tomas reminds us that everything in F# is statically typed.  Also that everything is immutable.  He talks about how we would proceed to define a domain-specific language for any domain that we may wish to model and create a language for.  He says that we should always start by examining the data that we’ll be working with in the domain.  It’s important to identify the primitives that we’ll be using.  In the case of Tomas’ stock option DSL, his primitives are the call and put options.  It’s from these two primitive functions that further, more complex functions can be created by simply combining these functions in certain ways.  The call and put functions calculate the possible gains and/or losses for each of the two options (call or put) based upon a specific current actual price. Tomas is then able to “pipeline” the data that is output from these functions into a “plot” function to generate a graph that allows us to visualize the data.  He then composes a new function which effectively “merges” the two existing functions before pipelining the result again to create another graph that shows the combined result set on a single graph.  From this we can visualize data points that represent the best possible price at which to either buy or sell our options.

 

Tomas tells us that F# is great for prototyping as you’re not constrained by any systems or frameworks; you can simply write functions that accept primitive input data, process that data, then output the result.  Further functions are then composed of those more basic functions, and this allows for very quick testing of a given hypothesis or theory.

 

For some sample F# code, Tomas models the domain first like so:

type Option =
| EuropeanPut of decimal
| EuropeanCall of decimal
| Combine of Option * Option

This is simply modelling the domain and the business language used within that domain.  The actual functionality to implement this business language is defined later.

 

Tomas then decides that the definition can actually be rewritten to something equivalent but slightly better like so:

type OptionKind = Put | Call

type Option =
| European of OptionKind * decimal
| Combine of Option * Option

He can then combine these two put/call options/functions like so:

let Strangle lowPrice highPrice =
    Combine
       ( European(Put, lowPrice),
         European(Call, highPrice) )

Strangle is the name of a specific type of option in the real world of stock options and this option is a combination of call and put options that are combined in a very specific way.  The function called Strangle is now defined and is representative of the domain within which it is used.  This makes it a perfect part of the domain-specific language.

 

Tomas eventually moves on to briefly showing us a DSL for pattern detection.  He shows a plotted graph that can go up and down as it moves along the x axis, and how we can use F#-defined DSL-specific functions to detect that movement up or down.  We start by defining the “primitives”.  That could be the amount of the movement (say, expressed in pixels or some other arbitrary unit we decide to use), and then a “classifier”.  The classifier tells us in which direction the movement is (i.e. up or down).  With these primitives defined, we can create functions that detect this movement based upon a certain number of points that are plotted on our graph.  Although Tomas didn’t have time to write the code for this as we watched (we were fairly deep into the talk at this point with only a few minutes left), he showed the code he had prepared earlier running live on the monitor in front of us.  He showed how he could create multiple DSL functions, all derived from the primitives, that could determine trends in the movement of the plotted graph over time.  These included detection of movement of the graph upwards, movement of the graph downwards, and even movement of the graph in a specific bell curve style (i.e. a small downwards movement, immediately followed by an upwards movement).  For each of these combined functions, Tomas was able to apply them to the graph in real time, by simply “wiring up” the graph output – itself a DSL function, in this case a recursive one that simply returned itself with new data with every invocation – to the detection functions of the DSL.

 

At this point, Tomas was out of time, however what we had just seen was an incredibly impressive display of the expressiveness, the terseness, and the power of F# and how domain-specific languages created using F# can be both very feature rich and functional (pardon the pun!) with the minimum of code.

 

At this point, the conference was almost over.  We all left our final sessions and re-gathered in the main hall area to finish off the lovely cheese (amazingly there was still some left over from earlier on!) and wait whilst the conference organisers and Hauser Forum staff rearranged the seminar rooms into one big room in which the closing talk from the conference organisers would be given.

 

After a few minutes, we all shuffled inside the large room, and listened as the DDD organisers thanked the conference sponsors, the speakers, the university staff and finally the attendees for making the conference the great success that it was.  And it was indeed a splendid event.  There were then some prizes and various items of swag to be given away (I didn’t win anything :( ) but I’d had a fantastic day at a very well organised and well run event and I’d learned a lot, too.  Thanks to everyone involved in DDDEA, and I hope it’s just as good next year!

I’m now a Microsoft Certified Solutions Developer!


Ever since my last post about Microsoft Certification, I’ve been slowly beavering away to study and take the two remaining exams that would allow me to be recognised as a Microsoft Certified Solutions Developer: Web Applications.  I had passed the second exam back in April, and this past Saturday, I took the final exam required to gain the Microsoft Certified Solutions Developer (MCSD) certificate.  I’m pleased to say that I passed.

 

The MCSD: Web Applications certification is an interesting one and has allowed me to brush up my skills on a number of languages and technologies, despite using most of these day in and day out in my day job!

 

The first exam (70-480 – Programming in HTML5, JavaScript & CSS3) that I took back in February required some study of HTML5 and CSS3, although I’m using both of these technologies at work.  Additional study certainly helps, though, as I was probably only using a small subset of these technologies, since I’m mainly working on legacy code.  Passing this exam, as well as being one of the three exams required for the MCSD certificate, came with its own certificate – a Microsoft Specialist certificate.

 

The second exam (70-486 - Developing ASP.NET MVC 4 Web Applications) was relatively simple.  This exam was all about building ASP.NET MVC 4 applications and this was an area that I had a good amount of knowledge and experience of.  I’m using ASP.NET MVC frequently within my day job when trying to improve our legacy code base one small piece at a time, and I’m also using ASP.NET MVC in some of my own software that I write for fun in my spare time.

 

The third and final exam (70-487 – Developing Windows Azure and Web Services) was the really interesting one of the three, as this was all about web services (primarily using ASP.NET MVC Web API) but specifically using Windows Azure as the deployment platform.  I had only really started to scratch the surface of ASP.NET MVC Web API and, although I was aware of Windows Azure’s existence, I’d never used it at all.  Around 8 weeks prior to taking this exam, I started to study more of ASP.NET MVC Web API and tried to implement some of its functionality in my own spare-time projects.  It was also somewhat fortuitous that Microsoft happened to be running a free trial promotion with Windows Azure whereby you could receive $200 of free credit, lasting one month, to spend on any Azure services you liked.  This enabled me to not only read about and study Windows Azure from books, articles and blogs but to also get some real-world hands-on experience with the platform.  This was a big boon and I believe it helped me immensely when it came to taking this final exam.

 

So, having now acquired the MCSD certificate, where to from here?  Well, I’m not sure.  I’m very happy with the certification I’ve been able to obtain thus far, so I’ll enjoy that for the time being and think about my next move later in the year.

DDD North 2013 In Review


On Saturday 12th October 2013, in a slightly wet and windy Sunderland, the 3rd DDD North developer conference took place.  DDD North events are free, one-day conferences for .NET and the wider development community, run by developers for developers.  This was the 3rd DDD North, and my 3rd DDD event in general (I’d missed the first DDD North, but did get to attend DDD East Anglia earlier this year), and this year’s DDD North was better than ever.

 

The day started when I arrived at the University of Sunderland campus.  I was travelling from Newcastle, having travelled to the North-East on the Friday evening beforehand.  I’m lucky in that I have in-laws in Newcastle, so I was staying with them for the duration of the weekend, making the journey to Sunderland fairly easy.  Well, not that easy.  I don’t really know Sunderland so I’d had to use my Sat-Nav, which was great until we got close to the city centre, at which point my Sat-Nav took me on an interesting journey around Sunderland’s many roundabouts! :)

 

I eventually arrived at the Sir Tom Cowie Campus at the University of Sunderland and parked my car, thanks to the free (and ample) car parking provided by the university.


I’d arrived reasonably early for registration, which opened at 8:30am; however, there was still a small queue, which I dutifully joined to wait to be signed in.  Once I was signed in, it was time to examine the goodie bag that had been handed to me upon entrance to see what was inside.  There was some promotional material from some of the great sponsors of the event as well as a pen (very handy, as I always forget to bring pens to these events!) along with other interesting swag (the pen-cum-screwdriver was a particularly interesting item).

 

The very next task was to find breakfast!  Again, thanks to some of the great sponsors of DDDNorth, the organisers were able to put on some sausage and bacon breakfast rolls for the attendees.  This was a very welcome addition to the catering that was provided last time around at DDD North.

 


Once the bacon roll had been acquired, I was off to find perhaps the most important part of the morning’s requirements.  Caffeine.  Now equipped with a bacon roll and a cup of coffee, I was ready for the long but very exciting day of sessions ahead of me.

 

DDD North is somewhat larger than DDD East Anglia (although the latter will surely grow over time), so whereas DDD East Anglia had 3 parallel tracks of sessions, DDD North has 5!  This can frequently lead to difficulties in deciding which session to attend, but it is really testament to the variety and quality of the sessions at DDD North.  So, having made the difficult choices of which sessions to attend, I headed off to the room for my first session.

 

The first session up was Phil Trelford’s F# Eye 4 the C# Guy.  This session was one of three sessions during the day dedicated to F#.  Phil’s session was aimed at developers currently using C#, and he starts off by saying that, although F# offers some advantages over C#, there’s no “one true language” and it’s often the correct approach to use a combination of languages (both C# and F#) within a single application.  Phil goes on to talk about the number and variety of companies that are currently using and taking advantage of the features of F#.  F# was used within Halo 3 for the multi-player component, which uses a self-improving machine learning algorithm to monitor, rate and intelligently match players with similar abilities together in games.  This same algorithm was also tweaked and later used within the Bing search engine to match adverts to search queries.  Phil also shares with us a quotation from a company called Kaggle, who were previously predominantly a C# development team and who moved a lot of their C# code to F# with great success.  They said that their F# was “consistently shorter, easier to read, easier to refactor and contained far fewer bugs” compared to the equivalent C# code.

 

Phil talks about the features of the F# language next.  It’s statically typed and multi-paradigm.  Phil states that it’s not entirely a functional language, but is really “functional first” and is also object-oriented.  It’s also completely open source!  Phil’s next step is to show a typical class in C#, the standard Person class with Name and Age properties:

 

public class Person
{
    private string _Name;
    private int _Age;

    public Person(string name, int age)
    {
        _Name = name;
        _Age = age;
    }

    public string Name
    {
        get { return _Name; }
        set { _Name = value; }
    }

    public int Age
    {
        get { return _Age; }
        set { _Age = value; }
    }

    public override string ToString()
    {
        return string.Format("{0} {1}", _Name, _Age);
    }
}

 

Phil’s point here is that although this is a simple class with only two properties, the number of times that the word “name” or “age” is repeated is excessive.  Phil calls this the “Local Government Pattern” as everything has to be declared in triplicate! :)  Here’s the same class, with the same functionality, but written in F#:

 

namespace People

type Person (name, age) = 
    member person.Name = name
    member person.Age = age

    override person.ToString() = 
        sprintf "%s %d" name age

 

Much shorter, and with far less repetition.  But it can get far better than that.  Here’s the same class again (albeit minus the .ToString() override) in a single line of F#:

 

type Person = { Name: string; Age: int }

 

Phil continues his talk to discuss how, being a fully-fledged, first-class citizen of a language in the .NET world, F# code and components can fully interact with C# components, and vice-versa.  F# also has the full extent of the .NET Framework at its disposal, too.  Phil shows some more F# code, this time something called a “discriminated union”:

 

type Shape =
    | Circle of float
    | Square of float
    | Rectangle of float * float

I’d come across discriminated unions before but, as an F# newbie, I only barely understood them.  Something that really helped me, at least as a C# guy, was when Phil explained the IL that is generated from the code.  In the above example, the Shape class is defined as an abstract base class and the Circle, Square and Rectangle classes are concrete implementations of the abstract Shape class!  Although thinking of these unions as base and derived classes isn’t strictly true when thinking of F# and its functional-style paradigm, it certainly helped me in mentally mapping something in F# back to the equivalent concept in C# (or a more OOP-style language).
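
For other C# folks, the mental model Phil gave is roughly the hand-written C# below (the real compiled form has extra plumbing such as Tag properties and factory methods, so this is only an approximation):

// Roughly what the F# Shape union looks like to a C# consumer:
// an abstract base type with one sealed subclass per union case.
public abstract class Shape
{
    public sealed class Circle : Shape
    {
        public double Radius { get; private set; }
        public Circle(double radius) { Radius = radius; }
    }

    public sealed class Square : Shape
    {
        public double Side { get; private set; }
        public Square(double side) { Side = side; }
    }

    public sealed class Rectangle : Shape
    {
        public double Width { get; private set; }
        public double Height { get; private set; }
        public Rectangle(double width, double height) { Width = width; Height = height; }
    }
}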

 

Phil continues by mentioning some of the best ways to get up to speed with the F# language.  One of Phil’s favourite methods for complete F# newbies is the F# Koans GitHub repository.  Based upon the Ruby Koans, this repository contains “broken” F# code that is covered by a number of unit tests.  You run the unit tests to see them fail, and your job is to “fix” the broken code, usually by “filling in the blanks” that are purposely left there, thereby allowing the test to pass.  Each time you fix a test, you learn a little more about the F# syntax and the language constructs.  I’ve already tried the first few of these and they’re a really good mechanism for a beginner to get to grips with F#.  Phil states that he uses these Koans to train new hires in F# for the company he works for.  Phil also gives a special mention to the tryfsharp.org website, which also allows newbies to F# to play with the language.  What’s special about tryfsharp.org is that you can try out the F# language entirely from within your web browser, needing no other installed software on your PC.  It even contains full IntelliSense!

 

Phil’s talk continues with a discussion of meta-programming and F#’s “quotations”.  These are similar to C#’s Expressions but more powerful.  They’re a more advanced subject (and worthy of a talk all of their own no doubt) but effectively allow you to represent F# code in an expression tree which can be evaluated at runtime.  From here, we dive into BDD and testing of F# code in general.  Phil talks about a BDD library (his own, called TickSpec) and how even text-based BDD test definitions are much more terse within F# rather than the equivalent C# BDD definitions (See the TickSpec homepage for some examples of this).  Not only that, but Phil shows a remarkable ability to be able to debug his BDD text-based definitions within the Visual Studio IDE, including setting breakpoints, running the program in debug mode and breaking in his BDD text file!  He also tells a story of how he was able, with a full suite of unit and BDD tests wrapped around the code, to convert a 30,000+ line C# code base into a 200 line F# program that not only perfectly replicated the behaviour of the C# program, but was actually able to deliver even more – all within less than 1/10th of the lines of code!

 

Phil shows us his “Cellz” spreadsheet application written in F# next.  He says it’s only a few hundred lines of code and is a good example of a medium-sized F# program.  He also states that his implementation of the code that parses and interprets user-defined functions within a spreadsheet “cell” is sometimes as little as one line of code!  We all ponder as to whether Excel’s implementations are as succinct! :)  As well as Cellz, there are a number of other projects of Phil’s that he tells us about.  One is a mocking framework, similar to C#’s Moq library, which of course had to be called Foq!  There is also a “Mario”-style game that we are shown, created with FunScript (which allows JavaScript to be generated from F# code, with the help of a set of type providers), and Phil also shows us a PacMan clone, running in the browser, created with only a few hundred lines of F# code.

 

Nearing the end of Phil’s talk, he shows us some further resources for our continued education, pointing out a number of books that cover the F# language.  Some specific recommendations are “Programming F#” as well as Phil’s own book (currently in early-access), “F# Deep Dives” which is co-authored by Tomas Petricek (whom I’d seen give an excellent talk on F# at DDD East Anglia).  Finally, Phil mentions that, although F# is a niche language with far fewer F# programmers than C# programmers, it’s a language that can command some impressive salaries! :)  Phil shows us a slide that indicates the UK average salary of F# programmers is almost twice that of a C# programmer.  So, there may not be as much demand for F# at the moment, but with that scarcity comes great rewards! :)

 

Overall, Phil’s talk was excellent and very enlightening.  It certainly helped me as a predominantly C# developer to get my head around the paradigm shift that is functional programming.  I’ve only scratched the surface so far, but I’m now curious to learn much more.

 

20131012_115002

After a quick coffee break back in the main hall of the campus (during which time I was able to snaffle a sausage baguette which had been left over from the morning breakfast!), I headed off to one of the largest rooms being used during the entire conference for my next session.  This one was Kendall Miller’s Scaling Systems: Architectures That Grow.

 

 

Kendall opens his talk by saying that the entire session will be technology agnostic.  He says that what he’s about to talk about are concepts that apply right across the board and across the complete technology spectrum.  In fact, the concepts that Kendall is about to discuss regarding scalability, in terms of how to achieve it and the things that can prevent you achieving it, are not only technology agnostic, but they haven’t changed in 30+ years of computing!

 

Kendall first asks, “What is scalability?”  Scaling is the ability for a system to cope under a certain demand.  That demand is clearly different for different systems.  Kendall shows us some slides that differentiate between the “big boys” such as Amazon, Microsoft, Twitter etc., who are scaling to anything between 30-60 million unique visitors per day and those of us mere mortals that only need to scale to a few thousand or even hundred users per day.  If we have a website that needs to handle 25,000 unique visitors per day, we can calculate that this is approximately 125,000 pages per day.  In the USA, there’s around 11 “high traffic” hours (these are the daytime hours, but spread across the many time zones of North America).  This gives us a requirement of around 12,000 pages/hour, and that divides down to only 3.3 pages per second.  This isn’t such a large amount to expect of our webserver and, in the grand scheme of things, is effectively “small fry” and should be easily achievable in any technology.  If we’re thinking about how we need our own systems to scale, it’s important to understand what we’re aiming for.  We may actually not need all that much scalability!  Scalability costs money, so we clearly don’t need to aim for scalability to millions of daily visitors to our site if we’re only likely to ever attract a few thousand.

 

We then ask, “What is availability?”  Availability is having a request completed in a given amount of time.  It’s important to think about the different systems that this can apply to and the relative time in which users of those systems will expect a request to be completed.  For example, simply accessing a website from your browser is a request/response cycle that’s expected to be completed within a very short amount of time.  Delays here can turn people away from your website.  Contrast this with (for example) sending an email.  Here, it’s expected that email delivery won’t necessarily be instantaneous and the ability of the “system” in question to respond to the user’s request can take longer.  Of course, it’s expected that the email will eventually be delivered, otherwise the system couldn’t be said to be “available”!

 

Regarding websites, Kendall mentions that in order to achieve scalability we need only concern ourselves with the dynamic pages.  Our static content should be inherently scalable in this day and age, as scaling static content has long been a solved problem.  Geo-located CDNs can help in this regard and have been used for a long time.  Kendall tells us that achieving scalability is simple in principle, but obviously much harder to implement in practice.  That said, once we understand the principles required for scalability, we can seek to ensure our implementations adhere to them.

 

There are only 3 things required to make us scale.  And there’s only 1 thing that prevents us from scaling!

 

Kendall then introduces the 4 principles we need to be aware of:  ACD/C. 

 

This acronym is explained as Asynchronicity, Caching, Distribution & Consistency.  The first three are the principles which, when applied, give us scalability.  The last one, Consistency (or at least the need for our systems to remain in a consistent internal state), is the one that will stand in the way of scalability.  Kendall goes on to elaborate on each of the 4 principles, but he also re-orders them into the order in which they should be applied when attempting to introduce scalability into a system that perhaps has none already.  We need to remember that scalability isn’t finite and that we need to ensure we work towards a scalability goal that makes sense for our application and its demands.

 

Kendall first introduces us to our own system’s architecture.  All systems have this architecture, he says…!  I must admit, it’s a fairly popular one:

 

[Diagram: the classic three-tier architecture – web server, application server, database server]

 

Kendall then talks about the principles we should apply, and the order in which we should apply them to an existing system in order to add scalability.

 

The first principle to add to a system is Caching.  Caching is almost always the easiest to start with to introduce some scalability in a system/application that needs it.  Caching is saving or storing the results of earlier work so that it can be reused at some later point in time.  After all, the very best performing queries are those ones that never have to be run!  Sometimes, caching alone can prevent around 99% of processing that really needn’t be done (i.e. a request for a specific webpage may well serve up the same page content over a long period of time, thus multiple requests within that time-scale can serve up the cached content).  Caching should be applied in front of everything that is time consuming and it’s easiest to apply in a left-to-right order (working from adding a cache in front of the web server, through to adding one in front of the application server, then finally the database server).

 

Once in place, the caches can use very simple strategies, as these can be incredibly effective despite their simplicity.  Microsoft’s Entity Framework uses a strategy that removes all cached entries as soon as a user commits a write (add/update/delete) to the database.  Whilst eradicating the entire cache on every write may seem excessive on the surface, it really isn’t: in the vast majority of systems, reads from the database outnumber writes by an order of magnitude, so the cache remains incredibly effective in real-world usage.  We’re reminded that applications ask lots of repeated questions.  Stateless applications even more so, but the answers to these questions rarely change.  Authoritative information, such as the logged on user’s name, is expensive to repeatedly query for as it’s required so often.  Such information is a prime candidate to be cached.
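To make that idea a little more concrete, here’s a minimal sketch of a cache-aside read combined with that same crude “flush everything on a write” strategy, using .NET’s built-in MemoryCache.  It’s purely my own illustration rather than anything shown in the talk, and the repository, key names and 5-minute expiry are all hypothetical:

using System;
using System.Linq;
using System.Runtime.Caching;

// A hypothetical cache-aside repository: reads are served from the cache where
// possible, and any write simply flushes the entire cache, relying on reads
// vastly outnumbering writes.
public class CachedProductRepository
{
    private readonly MemoryCache cache = MemoryCache.Default;

    public Product GetProduct(int id, Func<int, Product> loadFromDatabase)
    {
        var key = "product:" + id;
        var cached = cache.Get(key) as Product;
        if (cached != null)
        {
            return cached;  // The best performing query is the one that never runs!
        }

        var product = loadFromDatabase(id);
        cache.Set(key, product, DateTimeOffset.UtcNow.AddMinutes(5));
        return product;
    }

    public void SaveProduct(Product product, Action<Product> writeToDatabase)
    {
        writeToDatabase(product);

        // The simple strategy described above: any write clears everything.
        foreach (var key in cache.Select(item => item.Key).ToList())
        {
            cache.Remove(key);
        }
    }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}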

 

An interesting point that Kendall makes here is to question the conventional wisdom that “the fewest lines of code is the fastest”.  He says that very often, that’s not really the case as very few lines of code in a method that is doing a lot of work implies that much of your processing is being off-loaded to other methods or classes that are doing your work for you.  This can often slow things down, especially if those other methods and/or classes are not specifically built to utilise cached data.  Very often, having more lines of code in a method can actually be the faster approach as your method is in total control of all of the processing work that needs to be done.  You’re doing all of the work yourself and so can ensure that the processing uses your newly cached data rather than expecting to have to read (or re-read it) from disk or database.

 

Distribution is the next thing to tackle after Caching.  Distribution is spreading the load around multiple servers and having many things doing your work for you rather than just one.  It’s important to note that the less state that’s held within your system, the better (and wider) you can distribute the load.  If we think of session state in a web application, such state will often prevent us from being able to fulfil user requests by any one of many different webservers.  We’re often in a position where we’ll require at least “Server Affinity” (also known as “sticky sessions”) to ensure that each specific user’s requests are always fulfilled by the same server in a given session.  Asynchronous code can really help here as it means that processing can be offloaded to other servers to be run in the background whilst the processing of the main work can continue to be performed in the foreground without having to wait for the response from the background processes.

 

Distribution is hardest when it comes to the database.  Databases, and indeed other forms of storage, are fundamentally state, and scaling state is very difficult.  This is primarily due to the need to keep that state consistent across its distributed load.  This is the same consistency, or the requirement of consistency, that can hinder all manner of scalability and is one of the core principles.  One technique for scaling your storage layer is to use something called “Partitioned Storage Zones”.  These are similar to the server affinity (or sticky sessions) used on the web server when state needs to be maintained, except that storage partitioning is usually more permanent.  We could have 5 separate database servers and split out (for example) 50 customers across those 5 database servers with 10 customers on each server.  We don’t need to synchronize the servers as any single given customer will only ever use the one server to which they’ve been permanently assigned.
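As a rough illustration of the partitioning idea (the server names and the simple modulo rule below are my own assumptions, not something from the talk), the routing logic for a set of partitioned storage zones can be as trivial as this:

using System;

// A hypothetical partitioned storage zone router: a given customer is always
// routed to the same database server, so no cross-server synchronisation of
// that customer's data is ever required.
public static class StoragePartitioner
{
    private static readonly string[] ConnectionStrings =
    {
        "Server=db01;Database=MyApp;Integrated Security=True",
        "Server=db02;Database=MyApp;Integrated Security=True",
        "Server=db03;Database=MyApp;Integrated Security=True",
        "Server=db04;Database=MyApp;Integrated Security=True",
        "Server=db05;Database=MyApp;Integrated Security=True"
    };

    public static string GetConnectionStringFor(int customerId)
    {
        // The same customer ID always maps to the same partition.
        var partition = Math.Abs(customerId) % ConnectionStrings.Length;
        return ConnectionStrings[partition];
    }
}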

 

After distribution comes Asynchronicity.  Asynchronicity (or Async for short) is always the hardest to implement and so is the last one to be attempted in order to provide scalability.  Async is the decoupling of operations to ensure that the minimum amount of work is performed within the “critical path” of the system.  The critical path is the processing that occurs to fulfil a user’s request end-to-end.  A user request to a web server for a given resource will require processing of the request, retrieval and processing of data before returning to the user.  If the retrieval and processing of data requires significant and time-consuming computation, it would be better if the user was not “held up” whilst waiting for the computation to complete, but for the response to be sent to the user in a more expedient fashion, with the results of the intensive computation delivered to the user at a later point in time.  Work should always be “queued” in this manner so that load is smoothed out across all servers and applications within the system.
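Here’s a minimal sketch of that queueing idea.  Everything in it (the ReportQueue type, the use of an in-process BlockingCollection rather than a proper message queue product) is my own simplification, but it shows the shape of the technique: the caller on the critical path only enqueues the work and returns immediately, whilst a single background worker drains the queue:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// A hypothetical background work queue: callers on the critical path simply
// enqueue a request and return; the expensive work happens later.
public class ReportQueue
{
    private readonly BlockingCollection<int> pendingReports = new BlockingCollection<int>();

    public ReportQueue()
    {
        // A single background consumer drains the queue.
        Task.Run(() =>
        {
            foreach (var customerId in pendingReports.GetConsumingEnumerable())
            {
                GenerateReport(customerId);
            }
        });
    }

    // Called on the critical path; returns almost instantly.
    public void RequestReport(int customerId)
    {
        pendingReports.Add(customerId);
    }

    private static void GenerateReport(int customerId)
    {
        // The expensive, time-consuming computation lives here; its results
        // would be cached or persisted ready for the user to collect later.
        Console.WriteLine("Generated report for customer {0}", customerId);
    }
}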

 

One interesting Async technique, which is used by Amazon for their “recommendation” engine, is “Speculative Execution”.  This is some asynchronous processing that happens even though the user may never have explicitly requested such processing or may never even be around to see the results of such processing.  This is a perfectly legitimate approach and, whilst seemingly contrary to the notion of not doing any work unless you absolutely have to, “speculative execution” can actually yield performance gains.  It’s always done asynchronously so it’s never blocking the critical path of work being performed, and if the user does eventually require the results of the speculative execution, it’ll be pre-computed and cached so that it can be delivered to the user incredibly quickly.  Another good async technique is “scheduled requests”.  These are simply specific requests from the user for some computation work to be done, but the request is queued and the processing is performed at some later point in time.  Some good examples of these techniques are an intensive report generation request from the user that will have its results available later, or a “nightly process” that runs to compute some daily total figures (for example, the day’s financial trading figures).  When requested the next day, the previous day’s figures do not need to be computed in real-time at all and the system can simply retrieve the results from cache or persistent storage.  This obviously improves the user’s perception of the overall speed of the system.  Amazon uses an interesting trick that actually goes against async in that they “wait” for an order’s email confirmation to be sent before displaying the order confirmation web page to the user.  It’s one of only a few areas of Amazon’s site that specifically isn’t async and is very intentionally done this way, as the user’s perception of an order being truly finalized is of receiving the confirmation email in their inbox!

 

Kendall next talks about the final principle, which of the 4 principles is the one that actually prevents scalability, or at least complicates it significantly.  It’s the principle of Consistency.  Consistency is the degree to which all parties within the system observe some state that exists within the system at the same time.  Of course, the other principles of distribution and asynchronicity that help to provide scalability will directly impact the consistency of a system.  With this in mind, we need to recognize that scalability and scaling is very much about compromise.

 

There are numerous consistency challenges when scaling a system.  Singleton data structures (such as a numbering system that must remain contiguous) are particularly challenging, as having multiple separate parts of a system that can generate the next number in sequence would require locking and synchronisation around the number generation in order to prevent the same number being used twice.  Kendall also talks about state that can be held at two separate endpoints of a process, such as a layer that reads some data from a database, and how this must be shared consistently: changes made to the database after the data has been read must ideally be communicated to the layer that previously read it.  Within the database context, this consistency extends to ensuring multiple database servers are kept consistent in the data that they hold, and queries across partitioned datasets must be kept in sync.  All of these consistency challenges will force compromises within the system; however, consistency can be achieved if the other 3 principles (Caching, Distribution & Async) are themselves implemented in a consistent manner and work towards the same goals.
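The contiguous numbering example is perhaps the easiest of these to picture in code.  Something like the hypothetical generator below works perfectly well on a single server, but every caller in the entire system has to serialise through that one lock (or its far more expensive distributed equivalent), which is exactly the sort of consistency requirement that fights against distribution:

// A hypothetical singleton sequence generator: guaranteeing contiguous,
// never-duplicated numbers forces every caller through a single lock.
public class ContiguousNumberGenerator
{
    private readonly object syncRoot = new object();
    private long current;

    public long Next()
    {
        lock (syncRoot)
        {
            // Only one caller, anywhere in the system, can be in here at a time.
            current = current + 1;
            return current;
        }
    }
}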

 

Finally, Kendall discusses how we can actually implement all of these concepts within a real-world system.  The key to this is to test your existing system and gather as many timings and metrics as you possibly can.  Remember, scaling is about setting a realistic target that makes sense for your application.  Once armed with metrics and diagnostic data, we can set specific targets that our scalability must reach.  This could be something like, “all web pages must return to the user within 500ms”.  You would then start to implement, working from left to right within your architecture, and implementing the principles in the order of simplicity and which will provide the biggest return on investment. Caching first, then Distribution, finally Async.  But, importantly, when you hit your pre-defined target, you stop.  You’re done.

 

20131012_115812

After another coffee break back in the main hall, during which time I was able to browse through the various stalls set up by the conference’s numerous sponsors, chat with some of the folks running those stalls, and even grab myself some of the swag that was spread around, it was time for the final session before lunch.  This one was Matthew Steeples’ “You’ve Got Your Compiler In My Service”.

 

Matthew’s talk was about the functionality and features that the upcoming Microsoft Roslyn project will offer to .NET developers.  Roslyn is a “compiler-as-a-service”.  This means that the C# compiler offered by Roslyn will be available to be interacted with via other C# code.  Traditionally, compilers – and the existing C# compiler is no exception – are effectively “black boxes” and operate in one direction only.  Raw source code is fed in at one end, and after “magic” happening in the middle, compiled executable binary code came out from the other end.  In the case of the C# compiler, it’s actually IL code that gets output, ready to be JIT’ed by the .NET runtime.  But once that IL is output, there’s really no simple way to return from the IL back to the original source code.  Roslyn will change that.

 

Roslyn represents a deconstruction of the existing C# compiler.  It exposes all of the compiler’s functionality publicly, allowing a developer to use Roslyn to construct new C# code with C# code!  Traditional compilers will follow a series of steps to convert the raw text-based source code into something that the compiler can understand in order to convert it into working machine code.  These steps can vary from one compiler to another, but generally consist of a step to first break down the text into the individual keywords, identifiers, literals and other tokens that can be further processed.  This step is known as “lexical analysis” (or tokenizing).  This is followed by “parsing”, or “syntax analysis”, which is the understanding of (and verification against) the syntactical rules of the language.  Next comes the “semantic analysis”, which is the checking of the semantics of the language’s expressions (for example, ensuring that the expression within an if statement’s condition evaluates to a boolean).  Finally, after all of this analysis, “code generation” can take place.

 

Roslyn, on the other hand, takes a different approach, and effectively turns the compiler of both the C# and VB languages into a large object model, exposing an API that programmers can easily interact with (For example: An object called “CatchClause” exists within the Roslyn.Compiler namespace that effectively represents the “catch” statement from within the try..catch block).

 

Creating code via Roslyn is achieved by creating a top-level object known as a Syntax Tree.  Syntax Trees contain a vast hierarchy of child objects, literally as a tree data structure, and usually contain multiple Compilation Units (a compilation unit is a single class or “module” of code).  Each compilation unit, in turn, contains further objects and nodes that represent (for example) a complete C# class, starting with the class declaration itself including its scope and modifiers, drilling down to the methods (and their scoping and modifiers) and ultimately the individual lines of code contained within.  These syntax trees ultimately represent an entire C# (or VB!) program and can either be declared and created within other C# code, or parsed from raw text.  Specifically, Syntax Trees have three important attributes.  They represent all of the source code in full fidelity, meaning every keyword, every variable name, every operator.  In fact, they’ll represent everything right down to the whitespace.  The second important attribute of a Syntax Tree is that, due to the first attribute, they’re completely reversible.  This means that code parsed from a raw text file into the SyntaxTree object model is completely reversible back to the raw text source code.  The third and final attribute is that of immutability.  Once created, Syntax Trees cannot be changed.  This means they’re completely thread-safe.

 

Syntax Trees break down all source code into only three types of object: Nodes, Tokens and Trivia.  Nodes are syntactic constructs of the language like declarations, statements, clauses and expressions.  Nodes generally also act as parent objects for other child objects and nodes within the Syntax Tree.  Tokens are the individual language grammar keywords but can also be identifiers, literals and punctuation.  Tokens have properties that represent (for example) their type (a token representing a string literal in code will have a property that represents the fact that the literal is of type string) as well as other meta-data for the token, but tokens can never be parents of other objects within the Syntax Tree.  Finally, trivia is everything else within the source code and is primarily concerned with largely insignificant text such as whitespace, comments, pre-processor directives etc.

 

The following bit of C# code shows how we can use Roslyn to parse a literal text representation of a simple “Hello World” application:

 

var tree = SyntaxTree.ParseText(@"
    using System;
    namespace HelloRoslyn
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine(""Hello World"");
            }
        }
    }
");

Once this code has been executed, the tree variable will hold a complete syntax tree that represents the entire program as defined in the string literal.  Once created, the tree variable’s syntax tree can be executed (i.e. the “Hello World” program can be run), it can be turned into IL (Intermediate Language), or turned back into the same source code!
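Roslyn’s API has changed a fair bit since the CTP that Matthew was demonstrating, but as a rough sketch of the same idea using the Roslyn packages that eventually shipped (the Microsoft.CodeAnalysis.CSharp namespace rather than the CTP’s Roslyn.Compilers.CSharp), the full-fidelity and reversibility attributes can be seen in action like this:

using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;

class SyntaxTreeDemo
{
    static void Main()
    {
        // Parse raw text into a syntax tree (shipped Roslyn API, not the CTP one).
        var tree = CSharpSyntaxTree.ParseText(@"
            using System;
            namespace HelloRoslyn
            {
                class Program
                {
                    static void Main(string[] args)
                    {
                        Console.WriteLine(""Hello World"");
                    }
                }
            }");

        var root = tree.GetRoot();

        // Nodes, tokens and trivia are all preserved in full fidelity...
        Console.WriteLine("Nodes:  {0}", root.DescendantNodes().Count());
        Console.WriteLine("Tokens: {0}", root.DescendantTokens().Count());
        Console.WriteLine("Trivia: {0}", root.DescendantTrivia().Count());

        // ...which is why the tree is completely reversible back to source text.
        Console.WriteLine(root.ToFullString());
    }
}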

 

The following C# code is the equivalent of the original parsed “Hello World” example, except that here we’re not just parsing from the raw source code text, we’re actually creating and building up the syntax tree by hand using the built-in Roslyn objects that represent the various facets of the C# language:

 

using System;
using Roslyn.Compilers.CSharp;

namespace HelloRoslyn
{
  class Program
  {
    static void Main()
    {
      string program = Syntax.CompilationUnit(
        usings: Syntax.List(Syntax.UsingDirective(name: Syntax.ParseName("System"))),
        members: Syntax.List<MemberDeclarationSyntax>(
          Syntax.NamespaceDeclaration(
            name: Syntax.ParseName("HelloRoslyn"),
            members: Syntax.List<MemberDeclarationSyntax>(
              Syntax.ClassDeclaration(
                identifier: Syntax.Identifier("Program"),
                members: Syntax.List<MemberDeclarationSyntax>(
                  Syntax.MethodDeclaration(
                    returnType: Syntax.PredefinedType(Syntax.Token(SyntaxKind.VoidKeyword)),
                    modifiers: Syntax.TokenList(Syntax.Token(SyntaxKind.StaticKeyword)),
                    identifier: Syntax.ParseToken("Main"),
                    parameterList: Syntax.ParameterList(),
                    bodyOpt: Syntax.Block(
                      statements: Syntax.List<StatementSyntax>(
                        Syntax.ExpressionStatement(
                          Syntax.InvocationExpression(
                            Syntax.MemberAccessExpression(
                              kind: SyntaxKind.MemberAccessExpression,
                              expression: Syntax.IdentifierName("Console"),
                              name: Syntax.IdentifierName("WriteLine"),
                              operatorToken: Syntax.Token(SyntaxKind.DotToken)),
                            Syntax.ArgumentList(
                              arguments: Syntax.SeparatedList(
                                Syntax.Argument(
                                  expression: Syntax.LiteralExpression(
                                    kind: SyntaxKind.StringLiteralExpression,
                                    token: Syntax.Literal("\"Hello world\"", "Hello world")
                                  )
                                )
                              )
                            )
                          )
                        )
                      )
                    )
                  )
                )
              )
            )
          )
        ));
    }
  }
}

Phew!  That’s quite some code there to create the Syntax Tree for a simple “Hello World” console application!  Although Roslyn can be quite verbose, and building up syntax trees in code can be incredibly cumbersome, the functionality offered by Roslyn is incredibly powerful.  So, why on earth would we need this kind of functionality?

 

Well, one current simple usage of Roslyn is to create a “plug-in” for the Visual Studio IDE.  This plug-in can interact with the source code editor window to dynamically interrogate the current user-edited source and perform alterations.  These could be refactoring and code generation, similar to the functionality that’s currently offered by the ReSharper or JustCode tools.  Of course, those tools can perform a myriad of interactions with the code editor windows of the Visual Studio IDE, however they probably currently have to implement their own parsing and translation engine over the code that’s edited by the user.  Roslyn makes this incredibly easy to accomplish within your own plug-in utilities.  Other usages of Roslyn include the ability for an application to dynamically “inject” code into itself.  At this point Matthew shows us a demo of a simple Windows Forms application with a simple textbox on the form.  He proceeds to type out a C# class declaration into the form’s textbox.  He ensures that this class declaration implements a specific interface that the Windows Forms application already knows about.  Once entered, the running WinForms app can take the raw text from the textbox and, using Roslyn, convert this text into a Syntax Tree.  This Syntax Tree can then be invoked as actual code, as though it were simply a part of the running application.  In this case, Matthew’s example has an interface that defines a single “GetDate” method which returns a string.  Matthew types into the WinForms textbox a class whose GetDate method returns the current date and time in the current locale.  This is then executed and invoked by the running application and the result is displayed on the form.  Matthew then shows how the code within the textbox can be easily altered to return the same date and time but in the UTC time zone.  One click of a button and the new code is parsed, interpreted and invoked using Roslyn to immediately show the new result on the Windows Form.
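Matthew’s demo used the CTP’s own APIs, but to give a flavour of how the “compile the textbox contents and run them” trick works, here’s a rough sketch against the Roslyn API that later shipped.  The DynamicCodeRunner type, looking the GetDate method up by reflection (rather than via the shared interface used in the demo) and the minimal error handling are all my own simplifications:

using System;
using System.IO;
using System.Linq;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class DynamicCodeRunner
{
    // Compiles the user-entered source into an in-memory assembly and invokes
    // a GetDate method on the first type that declares one.
    public static string CompileAndGetDate(string userSource)
    {
        var tree = CSharpSyntaxTree.ParseText(userSource);

        var compilation = CSharpCompilation.Create(
            "UserCode",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var ms = new MemoryStream())
        {
            var result = compilation.Emit(ms);
            if (!result.Success)
            {
                throw new InvalidOperationException(
                    string.Join(Environment.NewLine, result.Diagnostics.Select(d => d.ToString())));
            }

            var assembly = Assembly.Load(ms.ToArray());
            var type = assembly.GetTypes().First(t => t.GetMethod("GetDate") != null);
            var instance = Activator.CreateInstance(type);
            return (string)type.GetMethod("GetDate").Invoke(instance, null);
        }
    }
}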

 

Roslyn, as a new C# compiler, is itself written in C#.  One of the current complexities with the Roslyn toolkit is that the current C# compiler, which is written in C++, doesn’t entirely conform to the C# specification.  This makes it fairly tricky to reproduce the compiler in accordance with the C# specification, and the current dilemma is whether Roslyn should embrace the C# specification entirely (thus making it slightly incompatible with the existing C# compiler) or whether to faithfully reproduce the existing C# compiler’s behaviour even though it doesn’t strictly conform to the specification.

 

Matthew wraps up his talk with a summary of the Roslyn compiler’s abilities, which are extensive and powerful despite it still only being a CTP (Community Technology Preview) of the final functionality, and offers the link to the area on MSDN where you can download Roslyn and learn all about this new “compiler-as-a-service” which will, eventually, become a standard part of Visual Studio and C# (and VB!) development in general.

 

20131012_131316

After Matthew’s talk it was time for lunch.  Lunch at DDD North this year was just as great as last year.  We all wandered off to the main entrance hall where the staff of the venue were frantically trying to put out as many bags with a fantastic variety of sandwiches, fruit and chocolate bars as they could before the hoards of hungry developers came along to whisk them away.  The catering really was excellent as it was possible to pre-order specific lunches for those with specific dietary requirements, as well as ensuring there was a wide range of vegetarian options available too.

 

I examined the available options, which took a little while as I, too, have specific dietary requirements in that I’m a fussy bugger as I don’t like mayonnaise!  It took a little while to find a sandwich that didn’t come loaded with mayo, but after only a short while, I found one.  And a lovely sandwich it was too!  Along with my crisps, chocolate and fruit, I found a place to sit down and quietly eat my lunch whilst contemplating the quantity and quality of the information I’d learned so far.

 

20131012_131448

During the lunch break, there were a number of “grok talks” taking place in the largest of the lecture theatres that were being used for the conference (this was the same theatre where Kendall Miller had given his talk earlier).  Whilst I always try to take in at least one or two (if not all) of the grok talks that take place during the DDD (and other) conferences, unfortunately on this occasion I was too busy stuffing my face, wandering around the main hall and browsing the many sponsors’ stands, as well as chatting away to some old and new friends that I’d met up with there.  By the time I realised the grok talks were taking place, it was too late to attend.

 

After a lovely lunch, it was time for the first of the afternoon’s sessions, one of two remaining in the day.  This session saw us gathering in one of the lecture halls only to find that the projector had decided to stop working.  The DDD volunteers tried frantically to get the thing working again, but ultimately, it proved to be a futile endeavour.  Eventually, we were told to head across the campus to the other building that was being used for the conference and to a “spare room”, apparently reserved for such an eventuality.

 

After a brisk, but slightly soggy walk across the campus forecourt (the weather at this point was fairly miserable!) we entered the David Goldman Informatics Centre and trundled our way to the spare room.  We quickly sat ourselves down and the speaker quickly set himself up as we were now running slightly behind schedule.  So, without further ado, we kicked off the first afternoon session which was MongoDB For C# Developers, given by Simon Elliston Ball.

 

Simon’s talk was an introduction to the MongoDB No-SQL database and specifically how we as C# developers can utilise the functionality provided by MongoDB.  Mongo is a document-oriented database and stores it’s data as a collection of key/value pairs within a document.  These documents are then stored together as collections within a database.  A document can be thought of as a single row in a RDBMS database table, and the collection of documents can be thought of as the table itself, finally multiple collections are grouped together as a database, however, this analogy isn’t strictly correct.  This is very different from the relational structure you can can find in today’s popular database systems such as Microsoft’s SQL Server, Oracle, MySQL & IBM’s DB2 to name just a few of them.  Document oriented databases usually store their data represented in JSON format, and in the case of MongoDB, it uses a flavour of JSON known as BSON which is Binary JSON.  An example JSON document could something as simple as:

 

{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25
}

 

However, the same document could be somewhat more complex, like this:

 

{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": 10021
    },
    "phoneNumbers": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}

 

This gives us an ability that RDBMS databases don’t have, and that’s the ability to nest multiple values for a single “key” in a single document.  RDBMSs would require multiple tables joined together by a foreign key in order to represent this kind of data structure, but for document-oriented databases, this is fairly standard.  Furthermore, MongoDB is a schema-less database, which means that documents within the same collection don’t even need to have the same structure.  We could take our two JSON examples from above and safely store them within the exact same collection in the same database!  Of course, we have to be careful when we’re reading them back out again, especially if we’re trying to deserialize the JSON into a C# class.  Importantly, as MongoDB uses BSON rather than JSON, it can offer strong typing of the values that are assigned to keys.  Within the .NET world, the MongoDB client framework allows decorating POCO classes with annotations that will aid in the mapping between the .NET data types and the BSON data types.
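As a small illustration of those mapping annotations, here’s a hypothetical Person POCO decorated with attributes from the driver’s MongoDB.Bson.Serialization.Attributes namespace (the class shape itself is just my own example):

using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// A hypothetical POCO decorated with the C# driver's mapping attributes.
public class Person
{
    [BsonId]                    // maps to the document's _id field
    public ObjectId Id { get; set; }

    [BsonElement("firstName")]  // maps the property to a specific BSON key
    public string FirstName { get; set; }

    [BsonElement("lastName")]
    public string LastName { get; set; }

    [BsonElement("age")]
    public int Age { get; set; }

    [BsonIgnoreIfNull]          // schema-less: simply omitted from the document when null
    public string MiddleName { get; set; }
}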

 

So, given this incredible flexibility of a document-oriented database, what are the downsides?  Well, there are no joins within MongoDB.  This means we can’t join documents (or records) from one collection with another as you could do with different tables within a RDBMS system.  If your data is very highly relational, a document-oriented database is probably not the right choice, but a lot of data structures can be represented by documents.  MongoDB allows an individual document to be up to 16MB in size, and given that we can have multiple values for a given key within the document, we can probably represent an average hierarchical data/object graph using a single document.

 

Simon makes a comparison between MongoDB and another popular document-oriented database, RavenDB.  Simon highlights how RavenDB, being the newer document-oriented database, offers ACID compliance and transactions that stretch over multiple documents.  He states that MongoDB’s transactions are only per document.  MongoDB’s replication supports a Master-Slave configuration, but Raven’s replication is Master-Master, and MongoDB supports being used from within many different languages with native client libraries for JavaScript, Java, Python, Ruby, .NET, Scala, Erlang and many more.  RavenDB is effectively .NET only (at least as far as native client libraries go), however RavenDB does offer a REST-based API and is thus callable from any language that can reach a URI.

 

Simon continues by telling us about how we can get to play with MongoDB as C# developers.  The native C# MongoDB client library is distributed as a NuGet package which is easily installable from within any Visual Studio project.  The NuGet package contains the client library which enables easy access to a MongoDB Server instance from .NET as well as containing types that provides the aforementioned annotations to decorate your POCO classes to enable easy mapping of your .NET types to the MongoDB BSON types.  Once installed, accessing some data within a MongoDB database can be performed quite easily:

 

var client = new MongoClient(connectionString);
var server = client.GetServer(); 
var database = server.GetDatabase("MyDatabase");
var collection = database.GetCollection("MyCollection");

 

One of the nice things with MongoDB is that we don’t have to worry about explicitly closing or disposing of the resources that we’ve acquired with the above code.  Once these objects fall out of scope, the MongoDB client library will automatically close the database connection and release the connection back to the connection pool.  Of course, this can be done explicitly too, but it’s nice to know that failure to do so won’t leak resources.

 

Simon explains that all of Mongo’s operations are as “lazy” as they possibly can be, thus in the code above, we’re only going to hit the database to retrieve the documents from “MyCollection” once we start iterating over the collection variable.  The code above shows a simple query that simply returns all of the documents within a collection.  We can compose more complex queries in a number of ways, but perhaps the way that will be most familiar to C# developers is with LINQ-style query:

 

var readQuery = Query<Person>.EQ(p => p.PersonID, 2);
Person thePerson = personCollection.FindOne(readQuery);

This style of query allows retrieving a strongly-typed “Person” object using a Lambda expression as the argument to the EQ function of the Query object.  The resulting configured query object is then passed to the .FindOne method of the collection to allow retrieval of one specific Person object based upon the predicate of the query.  The newer versions of MongoDB support most of the available LINQ operators and expressions and collections can easily be exposed to the client code as an IQueryable:

 

var query =
   from person in personCollection.AsQueryable()
   where person.LastName == "Smith"
   select person;

foreach (var person in query)
// ....[snip]....

 

We can also create cursors to iterate over an entire collection of documents using the MongoCursor object:

 

MongoCursor<Person> personCursor = personCollection.FindAll();
personCursor.Skip = 100;
personCursor.Limit = 10;

foreach(var person in personCursor)
// .....[snip]....

Simon further explains how Mongo’s Update operations are trivially simple to perform too, often merely requiring the setting of the object properties, and calling the .Save method against the collection, passing in the updated object:

 

person.LastName = "Smith";
personCollection.Save(person);

Simon tells us that MongoDB supports something known as “write concerns”.  This mechanism allows us to have control returned to our code only after the master database and all slave servers have been successfully updated with our changes.  Without such a write concern, control returns to our code after only the master server has been updated, whilst the slaves continue to update asynchronously in the background.  Unlike most RDBMS systems, UPDATEs to MongoDB will, by default, only ever affect one document, and this is usually the first document that the update query finds.  If you wish to perform a multi-document update, you must explicitly tell MongoDB to perform such an update.
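As a rough sketch of both of those points against the 1.x-era C# driver that was current at the time (the collection, field names and values here are purely illustrative):

using MongoDB.Driver;
using MongoDB.Driver.Builders;

public static class WriteExamples
{
    public static void Run(MongoCollection<Person> personCollection, Person person)
    {
        // A write concern: don't return control until a majority of replica
        // set members have acknowledged the write.
        person.LastName = "Smith";
        personCollection.Save(person, WriteConcern.WMajority);

        // Updates only touch a single matching document by default; updating
        // every matching document has to be asked for explicitly.
        personCollection.Update(
            Query.EQ("LastName", "Smyth"),
            Update.Set("LastName", "Smith"),
            UpdateFlags.Multi);
    }
}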

 

As stated earlier, documents are limited to 16MB in size however MongoDB provides a way to store a large “blob” of data (for example, if you needed to store a video file) using a technology called GridFS.  GridFS sits on top of MongoDB and allows you to store a large amount of binary data in “chunks”, even if this data exceeds the 16MB document limit.  Large files are committed to the database with a simple command such as:

 

database.GridFS.Upload(filestream, "mybigvideo.wmv");

 

This will upload the large video file to the database, which will break down the file into many small chunks.  Querying and retrieving this data is as simple as retrieving a normal document, and the database and the database driver are responsible for re-combining all of the chunks of the file to allow you to retrieve the file correctly with no further work required on the developer’s behalf.
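Pulling the file back out again is just as terse.  A rough sketch (again using the 1.x-era driver API and a hypothetical local path):

using System.IO;
using MongoDB.Driver;

public static class GridFsExample
{
    public static void DownloadVideo(MongoDatabase database)
    {
        // The driver reassembles all of the stored chunks transparently.
        using (var stream = File.Create(@"C:\Temp\mybigvideo.wmv"))
        {
            database.GridFS.Download(stream, "mybigvideo.wmv");
        }
    }
}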

 

MongoDB supports GeoSpatial functionality which allows querying location and geographic data for results that are “near” or within a certain distance of a specific geographic location:

 

database = server.GetDatabase("MyDatabase");
var collection = database.GetCollection("MyCollection");
var query = Query.EQ("Landmarks.LandMarkType", new BsonString("Statue"));
double lon = 54.9117468;
double lat = -1.3737675;
var earthRadius = 6378.0; // km
var rangeInKm = 100.0; // km
var options = GeoNearOptions
              .SetMaxDistance(rangeInKm / earthRadius /* to radians */)
              .SetSpherical(true);
var results = collection.GeoNear(query, lat, lon, 10, options);

The above code sample would find all documents within the Landmarks collection that have a LandMarkType of Statue and which are also within the 100 kilometre range defined around our latitude and longitude position, returning at most 10 results.

 

MongoDB also supports the ability to query and transform data using a “MapReduce”  algorithm.  MapReduce is a very powerful way in which a large set of data can be filtered, sorted (the “map” part) and summarised (the “reduce” part) using hand-crafted map and reduce functions.  These functions are written in JavaScript and are interpreted by the MongoDB database engine, which contains a full JavaScript interpreter and execution engine.  Using this MapReduce mechanism, a developer can perform many of the same kinds of complicated “grouping” and aggregation queries that RDBMS systems perform.  For example, the following sample query would iterate over the collection within the database and sum the count of documents, grouped together by the key:

 

var map =
    "function() {" +
    "    for (var key in this) {" +
    "        emit(key, { count : 1 });" +
    "    }" +
    "}";

var reduce =
    "function(key, emits) {" +
    "    total = 0;" +
    "    for (var i in emits) {" +
    "        total += emits[i].count;" +
    "    }" +
    "    return { count : total };" +
    "}";

var mr = collection.MapReduce(map, reduce);

Finally, Simon wraps up his talk by telling us about a Glimpse plug-in that he’s authored himself which can greatly help to understand exactly what is going on between the client-side code that talks to the MongoDB client library and the actual requests that are sent to the server, as well as being able to inspect the resulting responses.

 

After a short trip back across the campus to grab a coffee in the other building that contains the main entrance hall, as well as an array of crisps, chocolate and fruit (these were the “left-overs” from the lunch bags of earlier in the afternoon!) to keep us developers well fed and watered, I trundled back across the campus to the same David Goldman Informatics Centre building I’d been in previously to watch the final session of the day.  This session was another F# session (F# was a popular subject this year) called “You’ve Learned The Basics Of F#, What’s Next?” and given by Ian Russell.

 

The basis of Ian’s talk was to examine two specific features of F# that Ian thought offered a fantastic amount of productivity over other languages, and especially over other .NET languages.  These two features were Type Providers and the MailboxProcessor.

 

First up, Ian takes a look at Type Providers.  First introduced in F# 3.0, Ian starts by explaining that Type Providers provide type inference over third party data.  What this essentially means is that a type provider for something like (say) a database can give the F# IDE type inference over what types you’ll be working with from the database as soon as you’ve typed in the line of code that specifies the connection string!  Take a look at the sample code below:

 

open System.Linq
open Microsoft.FSharp.Data.TypeProviders
type SqlConnection =
    SqlDataConnection<ConnectionString = @"Data Source=.\sql2008r2;Initial Catalog=chinook;Integrated Security=True">

let db = SqlConnection.GetDataContext()

let table =
    query { for r in db.Artist do
    select r }

 

The really important line of code from the sample above is this one:

 

query { for r in db.Artist do

Note the db.Artist part.  There’s no type within the code that defines what Artist is.  The FSharp Data Type Provider has, asynchronously and in the background of the IDE, quietly opened the SQL Server connection as soon as the connection string was specified in the code.  It’s examined the database referred to in the connection string and it has automatically generated the types based upon the tables and their columns within the database!

 

Ian highlights the fact that F#’s SQL Server type provider requires no mapping code to go from the F# types in code to the SQL Server entities.  The equivalent C# code using Entity Framework would be significantly more verbose.

 

Ian also shows how it’s easy to take the “raw” types captured by the type provider and wrap them up into a nicer pattern, in this case a repository:

 

type ChinookRepository () =
    member x.GetArtists () =
        use context = SqlConnection.GetDataContext()
        query { for g in context.Artist do
                select g }
        |> Seq.toList

let artists =
    ChinookRepository().GetArtists()

 

Ian explains how F# supports a “query” syntax that is very similar to (but much better than) C# and LINQ’s query syntax, i.e.:

 

from x in y select new { TheID = x.Id, TheName = x.FirstName }

 

The reason that F#’s query syntax is far superior is that F# allow you to define your own query syntax keywords.  For example, you can define your own keyword, “top” which would implement “Select Top X” style functionality.  This effectively allows you to define your own DSL (Domain-Specific Language) within F#!

 

After the data type provider, Ian goes on to show us how the same functionality of early-binding and type inference to a third-party data source works equally well with local CSV data in a file.  He shares the following code with us:

 

open FSharp.Data

let csv = new CsvProvider<"500-uk.csv">()

let data =
    csv.Data
    |> Seq.iter (fun t -> printf "%s %s\n" t.``First Name`` t.``Last Name``)

 

This code shows how you can easily express the columns from the CSV that you wish to work with by simply specifying the column name as a property of the type.  The actual type of this data is inferred from the data itself (numeric, string etc.) however, you can always explicitly specify the types should you desire.  Ian also shows how the exact same mechanism can even pull down data from an internet URI and infer strong types against it:

 

open FSharp.Data

let data = WorldBankData.GetDataContext()

data.Countries.``United Kingdom``.Indicators.``Central government debt, total (% of GDP)``
|> Seq.maxBy fst

 

The above code shows how simple and easy it is to consume data from the World Bank’s online data store in a strong, type inferred way.

 

This is all made possible thanks to the FSharp.Data library which is available as a NuGet package and is fully open-source and available on GitHub.  This library has the type providers for the World Bank and Freebase online data sources already built-in along with generic type providers for dealing with any CSV, JSON or XML file.  Ian tells us about a type provider that’s currently being developed to generically work against any REST service and will type infer the required F# objects and properties all in real-time simply from reading the data retrieved by the REST service.  Of course, you can create your own type providers to work with your own data sources in a strongly-typed, eagerly-inferred magical way! 

 

After this quick lap around type providers, Ian moves on to show us another well used and very useful feature of F#, the MailboxProcessor.  A MailboxProcessor is also sometimes known as an “Agent” (this name is frequently used in other functional languages) and effectively provides a stateless, dedicated message queue.  The MailboxProcessor consists of a lightweight message queue (the mailbox) and a message handler (the processor).  For code interacting with the MailboxProcessor, it’s all asynchronous; code can post messages to the message queue asynchronously (or synchronously if you prefer), however, internally the MailboxProcessor itself will only process its messages in a strictly synchronous manner and in a strict FIFO (First In, First Out) order, one message at a time.  This helps to maintain consistency of the queue.  Due to the MailboxProcessor exposing its messages asynchronously (but maintaining strict synchronicity internally), we don’t need to acquire locks when we’re dealing with the messages going in or coming out.  So, why is the MailboxProcessor so useful?

 

Well, Ian shows us a sample chat application that consists of simply posting messages to a MailboxProcessor.  The core functionality of the chat application is contained within just a few lines of code:

 

open System.Text

// "Agent" is the commonly-used F# alias for the MailboxProcessor type
type Agent<'T> = MailboxProcessor<'T>

type ChatMessage =
  | GetContent of AsyncReplyChannel<string>
  | SendMessage of string

let agent = Agent<_>.Start(fun agent ->
  let rec loop messages = async {

    // Pick next message from the mailbox
    let! msg = agent.Receive()
    match msg with
    | SendMessage msg ->
        // Add message to the list & continue
        return! loop (msg :: messages)

    | GetContent reply ->
        // Generate HTML with messages
        let sb = new StringBuilder()
        sb.Append("<ul>\n") |> ignore
        for msg in messages do
          sb.AppendFormat(" <li>{0}</li>\n", msg) |> ignore
        sb.Append("</ul>") |> ignore
        // Send it back as the reply
        reply.Reply(sb.ToString())
        return! loop messages }
  loop [] )


agent.Post(SendMessage "Welcome to F# chat implemented using agents!")
agent.Post(SendMessage "This is my second message to this chat room...")

agent.PostAndReply(GetContent)

 

The code above defines the message type and the agent at the heart of a single ChatRoom type that encapsulates all of the functionality required to “post” and “receive” messages from a MailboxProcessor – effectively mimicking the back and forth chat messages of a chat room.  Further code shows how this can be exposed over a webpage by utilising a HttpListener with another type:

 

let root = @"C:\Temp\Demo.ChatServer\"
let cts = new CancellationTokenSource()

HttpListener.Start
("http://localhost:8082/", (fun (request, response) -> async {
  match request.Url.LocalPath with
  | "/post" ->
      // Send message to the chat room
      room.SendMessage(request.InputString)
      response.Reply("OK")
  | "/chat" ->
      // Get messages from the chat room (asynchronously!)
      let! text = room.AsyncGetContent()
      response.Reply(text)
  | s ->
      // Handle an ordinary file request
      let file =
        root + (if s = "/" then "chat.html" else s.ToLower())
      if File.Exists(file) then
        let typ = contentTypes.[Path.GetExtension(file)]
        response.Reply(typ, File.ReadAllBytes(file))
      else
        response.Reply(sprintf "File not found: %s" file) }),
   cts.Token)

cts.Cancel()

 

This code shows how an F# type can be written to create a server which listens on a specific HTTP address and port and accepts messages to URL endpoints as part of the HTTP payload.  These messages are stored within the internal MailboxProcessor and subsequently retrieved to display on the webpage.  We can imagine two (or more) separate users with the same webpage open in their browser’s and each person’s messages getting both echoed back to themselves as well as being shown on each other user’s browsers.

 

Ian has actually coded up such a web application, with a slightly nicer UI, and ends off his demonstrations of the power of the MailboxProcessor by firing up two separate browsers on the same machine (mimicking two different users) and showing how chat messages from one user instantly and easily appear on the other user’s browser.  Amazingly, there’s a minimum of JavaScript involved in this demo, and even the back-end code that maintains the list of users and the list of messages is no more than a few screens full!

 

Ian wrapped up his talk by recapping the power of both Type Providers and the MailboxProcessor, and how both techniques build upon your existing F# knowledge and make the consumption and processing of data incredibly easy.

 

20131012_170122

After Ian’s talk it was time for the final announcements of the day and the prize give away!  We all made our way back to the main building, and to the largest room, the Tom Cowie Lecture Theatre.

 

After a short while all of the DDD North attendees, along with the speakers and sponsors, had assembled in the lecture theatre.  The main organiser of DDD North, Andy Westgarth, gave a short speech thanking the attendees and the sponsors.  I’d like to offer my thanks to the sponsors here also because, as Andy said, if it wasn’t for them there wouldn’t be a DDD North.  After Andy’s short speech a number of the sponsors took to the microphone to both offer their thanks to the organisers of the event and to give away some prizes!  One of the first was Rachel Hawley, who had been manning the Telerik stand all day, and who led the call for applause and thanks for Andy and his team.  After Rachel had given away a prize, Steve from Tinamous was up to thank everyone involved and to give away more prizes.  After Steve had given away his prize, Andy mentioned that Steve had generously put some money behind the bar for the after-event Geek Dinner that was taking place later in the evening and that everyone’s first drink was on him!   Thanks Steve!

 

Steve was followed by representatives from the NDC Conference, a representative from Sage and various other speakers and sponsor staff, all giving a quick speech to thank the organisers and to state how much they’d enjoyed sponsoring such a great community event as DDD North.

 

Of course, each of these sponsors had prizes to give away.  Each time, Andy would offer up a bag of the feedback forms which we’d submitted at the end of each session and the sponsor would draw out a winning entry.  As is usual for me, I didn’t win anything; however, lots of people did and there were some great prizes on offer, including a stack of various books, some Camtasia software licenses, along with a complete copy of Visual Studio Premium with MSDN!

 

Andy then gave a final closing speech, thanking everyone again and telling us that, although there’s no confirmed date or location for the next DDD North, it will definitely happen and it’ll be in a North-West location, as the intention is to alternate the location each time between a North East location and one in the North West in order to cover the entire “north” of England.

 

20131012_175549

And with that, another fantastic DDD North event was over…   Except that it wasn’t.  Not quite yet!   Courtesy of Make It Sunderland and Sunderland Software City, a “drinks reception” was being hosted at the Sunderland Software City offices!  The organisers of DDD North had laid on a free bus transfer service for the short ride from Sunderland University to the location of the Sunderland Software City offices closer to Sunderland city centre.  Since I was in the car, I drove the short 10-minute journey to the Sunderland Software City offices.  Of course, being in the car meant that my drinking was severely limited.

 

Around 80 of the 300+ attendees from DDD North made the trip to the drinks reception and we were treated to a small bar with two hand-pulled ales from the Maxim Brewery.  One was the famous Double Maxim and the other, Swedish Blonde.  Two fine ales, and they were free all night long, for as long as the cask lasted (or at least for the 2 hours that the drinks reception event lasted)!

 

Being a big fan of real ales, it was at this point that I was kicking myself for having brought the car with me to DDD North.  I could have relatively easily taken the Metro train service from Newcastle to Sunderland, but alas, I was not to know this fantastic drinks reception would be so great or that there would be copious amounts of real-ale on offer.  In hindsight though, it was probably for the best that my ability to drink the endless free ale was curtailed!  :)

20131012_175818

 

I made my way to a comfy seating area and was joined by Phil Trelford, who had given the first talk of the day that I attended and who I had been chatting with off and on throughout the day, and also Sean Newham.  Later we were joined by another guy whose name I forget (sorry).  We chatted about various things and had a really fun time.  It was here that Phil showed us an F# Type Provider that he and a friend had written in a moment of inspiration, which mimics the old “Choose Your Own Adventure” style books from the 1980’s by offering up the entire story within the Visual Studio IDE!

 

Not only were we supplied with free drinks for the evening, we were also supplied with a seemingly endless amount of nibbles, hors d’oeuvres and tiny desserts and cakes.  These were brought to us with such alarming frequency that they never seemed to end!   Not that I’m complaining… Oh no.  They were delicious, but there was a real fear that the sheer amount of these lovely nibbles would ruin everyone’s appetite for the impending Geek Dinner.

 

There’s a tradition at DDD events to have a “geek dinner” after the event where attendees that wish to hang around can all go to a local restaurant and have their evening dinner together.  I’d never been to one of these geek dinner’s before, but on this occasion, I was able to attend.  Andy had selected a Chinese buffet restaurant, the Panda Oriental Buffet, mainly because it was a very short walk from the Sunderland Software City offices, and also presumably because they use Windows Azure to host their website!

 

After the excellent drinks reception was finished, we all wandered along the high street in Sunderland city centre to the restaurant.  It took a little while for us all to be seated, but we were all eventually in and were able to enjoy some nice Chinese food and continue to chat with fellow geeks and conference attendees.  I managed to speak with a few new faces, some guys who worked at Sage in Newcastle, some guys who worked at Black Marble in Yorkshire and a few other guys who’d travelled from Leeds.

 

After the meal, and with a full belly, I bid goodbye to my fellow geeks and set off back towards my car which I’d left parked outside the Sunderland Software City offices to head back to what was my home for that weekend, my in-law’s place in Newcastle.  A relatively short drive (approx. 30-40 minutes) away.

 

And so ended another great DDD event.  DDD North 2013 was superb.  The talks and the speakers were superb, and Andy and his team of helpers had, once again, arranged a conference with superb organisation. So, many thanks to those involved in putting on this conference, and of course, thanks to the sponsors without whom there would be no conference.  Here’s looking forward to another great DDD North in 2014.    I can’t wait!

Mercurial Pushes and the authorization failure of doom!


 

image

Ever get the notorious “abort: authorization failed” message from a Mercurial “push” command??

 

You are not alone!

 

It seems the way to fix this is to always ensure that the Bitbucket username is specified within the “remote repository” URL!

 

image

i.e. the remote repository URL should be something like “https://craigtp@bitbucket.org/craigtp/fizzbuzz” and NOT “https://bitbucket.org/craigtp/fizzbuzz” – see the screenshot to the right.  In this case, my username is “craigtp”; obviously, replace that with your own username for your own repository/Bitbucket account.
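For reference, if you’d rather edit the configuration by hand than use the TortoiseHg settings dialog shown above, the same remote URL lives in the repository’s .hg/hgrc file and would look something like this (with your own username and repository substituted in, of course):

[paths]
default = https://craigtp@bitbucket.org/craigtp/fizzbuzz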

 

Bizarrely, this is still required to get around the “authorization failed” issue even though TortoiseHg will prompt you at runtime for both your Bitbucket username and password (if the username is not specified in the URL).  Once you’ve specified your username in the URL, you’re no longer prompted for it (only for the password), and the “abort: authorization failed” issue goes away!

 

UPDATE:

I’ve done further digging since posting this article, and unfortunately, I’m not convinced that this is the exact answer to the authorization problem.  Doing the above certainly fixed my specific problem at the time I had it, in the very specific circumstances of my environment (which including accessing BitBucket via a rather fussy proxy server) however I’ve since been able to happily make Mercurial pushes to BitBucket without having to specify the username in the remote repo URL in different environments.  My research on this matter continues!

I’m now an MCPD!


MCPD-RGB

Ever since I passed my MCTS exam back in December last year, I’ve been hard at work doing further study and practice to attempt to pass the follow-up exam 70-576: PRO: Designing and Developing Microsoft SharePoint 2010 Applications.

 

Well, I finally took that exam earlier this week and I’m pleased to report that I’ve passed!  Once again, I managed to get 100% on the exam, although I really don’t know how I achieved such a high percentage score this time around as I was convinced I’d got answers to some of the questions wrong!  Again, I’m over the moon with my result!

 

Passing this exam has now earned me the distinguished certification of:

Microsoft Certified Professional Developer – SharePoint Developer 2010

 

In taking these two exams within the last couple of months, it seems I’ve stirred something inside of me that quite likes the idea of acquiring further Microsoft Certification.  For now, though, I think I’ll take a little break from the intense study that I’ve done of the SharePoint platform, although I’ll still be using SharePoint 2010 quite frequently.  I seem to have amassed a fair sized collection of other tech books in the intervening period that I really need to get on with reading!

JavaScript / jQuery IntelliSense in Visual Studio 2012


I blogged a while ago about a rather ugly and hacky way in which you could get the goodness of jQuery (and general JavaScript) IntelliSense in the Razor editor in Visual Studio 2010’s IDE.

 

This basically involved placing code similar to the following into every MVC view where you wanted IntelliSense to show up:

 

@* Stupid hack to get jQuery intellisense to work in the VS2010 IDE! *@
@if (false)
{<script src="../../Scripts/jquery-1.6.2-vsdoc.js" type="text/javascript"></script>
}

 

Well, since the release of Visual Studio 11 Beta, and the recent release of Visual Studio 2012 RC (Visual Studio 2012 is now the formal name of Visual Studio 11) we now no longer have to perform the above hack and clutter up our MVC views in order to enjoy the benefits of IntelliSense.

 

In Visual Studio 2012 (hereafter referred to as VS2012) this has been achieved by allowing an additional file to be placed within the solution/project which will contain a list of “references” to other JavaScript files that all MVC views will reference and honour.

 

The first step to configuring this is to open up VS2012’s Options dialog by selecting TOOLS > OPTIONS from the main menu bar:

 

image

 

Once there, you’ll want to navigate to the Text Editor > JavaScript > IntelliSense > References options:

 

image

 

The first thing to change here is to select Implicit (Web) from the Reference Group drop-down list.  Doing this shows the list of references and included files within the Implicit (Web) group, as shown below the drop-down.  Implicit (Web) includes a number of .js files that are included with VS2012 itself (and are located within your VS2012 install folder), but it also includes the following, project-specific entry:

 

~/Scripts/_references.js

 

Of course, this is all configurable, so you can easily add your own file in here in your own specific location or change the pre-defined _references.js, but since ASP.NET MVC is based around convention over configuration, let’s leave the default as it is!  Click OK and close the options dialog.

 

Now, what’s happened so far is that as part of the pre-defined Implicit (Web) reference group, VS2012 will look for a file called _references.js within a web project’s ~/Scripts/ folder (the ~/ implying the root of our web application) and use the references that are defined within that file as other files that should be referenced from within each of our MVC views automatically.

 

So, the next step is to add this file to one of our MVC projects in the relevant location, namely the ~/Scripts/ folder.  Right-click on the Scripts folder and select Add > New Item:

 

[Screenshot: Add > New Item from the Scripts folder’s context menu]

 

Once the Add New Item dialog is displayed, we can add a new JavaScript File, making sure that we name the file exactly as the default pre-defined Implicit (Web) reference group expects the file to be named:

 

[Screenshot: the Add New Item dialog with the new JavaScript file named _references.js]

 

 

The format of the _references.js file follows the JScript Comments Reference format that has been in Visual Studio since VS2008.  It’s shown below:

 

/// <reference path="path to file to include" />

 

You can add as many or as few “references” within the _references.js file as you need.  Bear in mind, though, that the more files you add in here, the more it may negatively impact the performance of the editor/IDE as it’ll have far more files that it has to load and parse in order to determine what IntelliSense should be displayed.   A sample _references.js file is shown below:

 

[Screenshot: a sample _references.js file]

 

The format/syntax of the references within this file can take a number of forms.  You can directly reference other JavaScript files without needing a path if they’re in the same folder as the _references.js file (as the example above shows):

 

/// <reference path="jquery-1.6.3.js" />

 

You can use relative paths which are relative from the folder where the _references.js file is located:

 

/// <reference path="SubfolderTest/jquery-1.6.3.js" />

 

And you can also use paths that are relative to your web project’s “root” folder by using the special ASP.NET ~ (tilde) syntax:

 

/// <reference path="~/Scripts/SubfolderTest/jquery-1.6.3.js" />

 

Once this is configured and saved, you will now have lovely IntelliSense within your MVC Views without needing additional hacky script references from within the view itself.  See the screen shot below:

 

[Screenshot: jQuery IntelliSense appearing inside the Razor view]

 

Yep, that’s the entirety of the view that you can see there (no @if(false) nonsense!), and that’s lovely jQuery IntelliSense being displayed as soon as you start typing $( !

 

Update (10/Feb/2013): Changed the screenshots to use a nicer elliptical highlighting and tweaked some references ever-so-slightly (i.e. changed jQuery from 1.6.2 to 1.6.3)

SSH over SSL with BitBucket & GitHub.


I’ve recently decided to switch to using SSH (Secure Shell) access for all of my repositories on both BitBucket & GitHub.  I was previously using HTTPS access, however this frequently means that you end up with hard-coded usernames and passwords inside your Mercurial and Git configuration files.  Not the most secure approach.

I switched over to using SSH Keys for access to both BitBucket and GitHub and I immediately ran into a problem.  SSH access is, by default, done over Port 22, however this is not always available for use.  In a corporate environment, or over public Wi-Fi, this port is frequently blocked.  Fortunately, both GitHub and BitBucket allow using SSH over the port that is used for SSL (HTTPS) traffic instead, as this is almost never blocked (both Port 80 (HTTP) and 443 (HTTPS) are required for web browsing).

Setting this up is usually easy enough, but there can be a few slightly confusing parts to ensuring your SSH Keys are entered in the correct format and making sure you’re using the correct URI to access your repositories.  I found BitBucket that little bit easier to configure, and initially struggled with GitHub.  I believe this is primarily because GitHub is more geared towards Unix and OpenSSH users rather than Windows and PuTTY users.

Setting Up The Keys

The first step is to ensure that the SSH Key is in the correct format to be added to either your GitHub or BitBucket account.  If you’re using PuTTYGen to generate your SSH keys, the easiest way is to simply copy & paste the key from the PuTTYGen window:

[Screenshot: the PuTTYGen window]

In my own experience, I’ve found that BitBucket is slightly more forgiving of the exact format of the SSH Key.  I’d previously opened my private SSH Key files (.ppk file extension) in Notepad and copied and pasted from there.  When viewed this way, the SSH Key is rendered in an entirely different format as shown below:

[Screenshot: a .ppk private key file opened in Notepad]

It seems that BitBucket will accept copying and pasting the “public” section from this file (identified as the section between the lines “Public-Lines: 6” and “Private-Lines: 14”) however GitHub won’t.  Copying and pasting from the PuTTYGen window, though, will consistently work with both BitBucket & GitHub.

Configuring Client Access

The next step is to configure your client to correctly access your BitBucket and GitHub repositories using SSH over the HTTPS/SSL port.  Personally, I’ve been using TortoiseHG for some time now for my Mercurial repositories, but recently I’ve decided to switch to Atlassian’s Sourcetree as it allows me to work with both Mercurial & Git repositories from the same UI.  (I’m fairly comfortable with Mercurial from the command line, too, but never really got around to learning Git from the command line.  Maybe I’ll come back to it one day!)

BitBucket has a very helpful page in their documentation that details the URI that you’ll need to use in order to correctly use SSH over Port 443.  It’s a bit different from the standard SSH URI that you get from the BitBucket repository’s “home page”.

[Screenshot: the altssh.bitbucket.org URI format]

Note the altssh.bitbucket.org domain rather than the standard bitbucket.org one!  You’ll also need to add the port to the end of the domain as shown in the image.

Configuring the client access for GitHub was a little bit trickier.  Like BitBucket, GitHub has a page in their documentation relating to using SSH over SSL, however, this assumes you’re using the ssh command line tool, something that’s there by default in Unix/Linux but not there on Windows (although a 3rd party implementation of OpenSSH does exist).  The GitHub help page suggests changing your SSH configuration to “point” your ssh.github.com host name to run over Port 443.  That’s easy if you’re using the command line OpenSSH client, but if you’re using something like Tortoise or, in my case, SourceTree, that’s not so easy.

The way to achieve this is to forget about fiddling with configuration files, and just ensure that you use the correct URI, correctly formed in order to establish a connection to GitHub with SSH over SSL.  The standard SSH URL provided by GitHub on any of your GitHub repository homepages (as shown in the image) suggests that the URL should follow this kind of format:

git@github.com:craigtp/craigtp.github.io.git

That’s fine for “normal” SSH access where SSH connects over the standard, default port of 22.  You’ll need to change that URL if you want to use SSH over the SSL port (Port 443).  The first thing to notice is that the colon within the URL above separates the domain from the username.  Ordinarily, colons in URLs separate the domain from the port number to be used, however here we’re going to add the port number separated by a colon from the domain and move the username part to be separated after the port number by a slash.  We also need to change the actual domain from git@github.com to git@ssh.github.com.

[Screenshot: the GitHub SSH over SSL URL format]

Therefore our SSH over SSL URL becomes:

git@ssh.github.com:443/craigtp/craigtp.github.io.git

instead of:

git@github.com:craigtp/craigtp.github.io.git

(Obviously, replace the craigtp.github.io.git part of the URL with the relevant repository name!)

It’s a simple enough change, but one that’s not entirely obvious at first.


I’m now an MCSD in Application Lifecycle Management!


Well, after previously saying that I would give the pursuit of further certifications a bit of a rest, I’ve gone and acquired yet another Microsoft Certification.  This one is Microsoft Certified Solutions Developer – Application Lifecycle Management.

It all started around the beginning of January this year when Microsoft sent out an email with a very special offer.  Register via Microsoft’s Virtual Academy and you would be sent a 3-for-1 voucher for selected Microsoft exams.  Since the three exams required to achieve the Microsoft Certified Solutions Developer – Application Lifecycle Management certification were included within this offer, I decided to go for it.  I’d pay for only the first exam and get the other two for free!

So, having acquired my voucher code, I proceeded to book myself in for the first of the 3 exams.  “Administering Visual Studio Team Foundation Server 2012” was the first exam, which I’d scheduled for the beginning of February.  Although I’d had some previous experience of setting up, configuring and administrating Team Foundation Server, that was with the 2010 version of the product.  I realised I needed to both refresh and update my skills.  Working on a local copy of TFS 2012 and following along with the “Applying ALM with Visual Studio 2012 Jumpstart” course on Microsoft’s Virtual Academy site, as well as studying with the excellent book, “Professional Scrum Development with Microsoft Visual Studio 2012” that is recommended as a companion/study guide for the MCSD ALM exams, I quickly got to work.

I sat and passed the first exam in early February this year.  Feeling energised by this, I quickly returned to the Prometric website to book the second of the three exams, “Software Testing with Visual Studio 2012”, which was scheduled for March of this year.  I’d mistakenly thought this was all about unit testing within Visual Studio, and whilst some of that was included in this course, it was really all about Visual Studio’s “Test Manager” product.  The aforementioned Virtual Academy course and the book covered all of this course’s content, however, so continued study with those resources along with my own personal tinkering helped me tremendously.  When the time came I sat the exam and amazingly, passed with full marks!

So, with 2 exams down and only 1 to go, I decided to plough on and scheduled my third and final exam for late in April.  This final exam was “Delivering Continuous Value with Visual Studio 2012 Application Lifecycle Management” and was perhaps the most abstract of all of the exams, focusing on agility, project management and best practices around the “softer” side of software development.  Continued study with the aforementioned resources was still helpful, however, when the time came to sit the exam, I admit that I felt somewhat underprepared for this one.  But sit the exam I did, and whilst I ended up with my lowest score from all three of the exams, I still managed to score enough to pass quite comfortably.

So, with all three exams sat and passed, I was awarded the “Microsoft Certified Solution Developer – Application Lifecycle Management” certification.  I’ll definitely slow down with my quest for further certifications now….Well, unless Microsoft send me another tempting email with a very “special” offer included!

DDD East Anglia 2014 Review


Well, it’s that time of year again when a few DDD events come around.  This past Saturday saw the 2nd ever DDD East Anglia, bigger and better than last year’s inaugural event.

I’d set off on the previous night and stayed over on the Friday night in Kettering.  I availed myself of Kettering town centre’s seemingly only remaining open pub, The Old Market Inn (the Cherry Tree two doors down was closed for refurbishment) and enjoyed a few pints before heading back to my B&B.  The following morning, after a hearty breakfast, I set off on the approximately 1 hour journey into Cambridge and to the West Road Concert Hall, the venue for this year’s DDD East Anglia.

After arriving at the venue and registering, I quickly grabbed a cup of water before heading off across the campus to the lecture rooms and the first session of the day.

The first session is David Simner’s “OWIN, Katana and ASP.NET vNext – Eliminating the pain of IIS”.  David starts by summing up the existing problems with Microsoft’s IIS Server such as its cryptic error messages when simply trying to create or add a new website through to differing versions with differing support for features on differing OS versions.  e.g. Only IIS 8+ supports WebSockets, and IIS8 requires Windows 8 - it can’t be installed on lower versions of Windows.

David continues by calling out “http.sys” - the core of servicing web requests on Windows.  It’s a kernel-space driver that handles the request, looks at the host headers, url etc. and then finds the user space process that will then service the request.  It’s also responsible for dealing with the cryptography layer for SSL packets.  Although http.sys is the “core” of IIS, Microsoft has opened up http.sys to allow other people to use it directly without going through IIS.

David mentions how some existing technologies already support “self-hosting” meaning they can service http requests without requiring IIS. These technologies include WebAPI, SignalR etc., however, the problem with this self-hosting is that these technologies can’t interoperate this way.  Eg. SignalR doesn’t work within WebAPI’s self-hosting.

David continues by introducing OWIN and Katana.  OWIN is the Open Web Interface for .NET and Katana is a Microsoft implementation of OWIN.  Since OWIN is open and anyone can write their own implementation of it, this opens up the entire “web processing” service on Windows and allows us to both remove the dependence on IIS as well as have many differing technologies easily interoperate within the OWIN framework.  New versions of IIS will effectively be OWIN “hosts” as well as Katana being an OWIN host.  Many other implementations written by independent parties could potentially exist, too.

David asks why we should care about all of this, and states that OWIN just “gets out of your way” - the framework doesn’t hinder you when you’re trying to do things.  He says it simply “does what you want” and that it does this due to its rich eco-system and community providing many custom developments for hosts, middleware, servers and adapters (middleware is the layer that provides a web development framework, i.e. ASP.NET MVC, NancyFX etc. and an adapter is something like System.Web which serves to pass the raw data from the request coming through http.sys to the middleware layer.)

The 2nd half of David’s talk is a demo of writing a simple web application (using VS 2013) that runs on top of OWIN/Katana.  David creates a standard “Web Application” in VS2013, but immediately pulls in the NuGet package OwinHost (this is actually Katana!).  To use Katana, we need a class with the “magic” name of “Startup”, which Katana looks for at startup and runs.  The Startup class has a single void method called Configuration that takes an IAppBuilder argument; this method runs once per application run and exists to configure the OWIN middleware layer.  This can include such calls as:

app.UseWelcomePage("/");
app.UseWebApi(new HttpConfiguration());   // configure Web API routes etc. on the HttpConfiguration first
app.Use<MyCustomMiddleware>();            // i.e. your own custom class that inherits from OwinMiddleware
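Pulling those pieces together, a minimal Startup class of the kind David describes might look something like the sketch below (the welcome page middleware comes from the Microsoft.Owin.Diagnostics package, and any custom middleware class name is purely a placeholder of mine):

using Owin;

public class Startup
{
    // Katana finds this class by its conventional name and calls Configuration once at startup.
    public void Configuration(IAppBuilder app)
    {
        // Custom OwinMiddleware subclasses would be plugged in here with app.Use<T>().
        app.UseWelcomePage("/");
    }
}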

David starts with writing a test that checks for access to a non-existent page and ensures it returns a 404 error.  In order to perform this test, we can use the WebApp.Start method (which is part of Microsoft.Owin.Hosting – this is the Katana implementation of an OWIN host) which allows the test method to effectively start the web processing “process” in code.  The test can then perform things like:

var httpClient = new HttpClient();
var result = await httpClient.GetAsync("http://localhost:5555");
Assert.Equal(HttpStatusCode.NotFound, result.StatusCode);

Using OWIN in this way, though, can lead to flaky tests due to how TCP ports work within Windows and the fact that even when the code has finished executing, it can be a while before Windows will “tear down” the TCP port allowing other code to re-use it.  To get around this, we can use another NuGet package, Microsoft.Owin.Testing, which allows us to effectively bypass sending the HTTP request to an actual TCP port and process it directly in memory.  This means our tests don’t even need to use an actual URL!
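A rough sketch of such an in-memory test using Microsoft.Owin.Testing might look like this (xUnit is assumed here purely for the assertions, and the request path is made up):

[Fact]
public async Task Requesting_a_missing_page_returns_404()
{
    using (var server = TestServer.Create<Startup>())
    {
        // The request never touches a real TCP port - it is handled entirely in memory.
        var response = await server.HttpClient.GetAsync("/does-not-exist");
        Assert.Equal(HttpStatusCode.NotFound, response.StatusCode);
    }
}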

David shows how easy it is to write your own middleware layer, which consists of his own custom class (inheriting from OwinMiddleware) which contains a single method that invokes the next “task” in the middleware processing chain, but then returns to the same method to check that we didn’t take too long to process that next method.  This is easily done as each piece of middleware processing is an async Task, allowing us to do things like:

Next.Invoke(context).ContinueWith(_ => LogIfWeTookTooLong(context));
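Fleshed out into a full middleware class (using await rather than ContinueWith, which amounts to the same thing), that timing check might look roughly like the following – the class name, threshold and logging choice are all my own illustrative assumptions:

using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.Owin;

public class TimingMiddleware : OwinMiddleware
{
    public TimingMiddleware(OwinMiddleware next) : base(next) { }

    public override async Task Invoke(IOwinContext context)
    {
        var stopwatch = Stopwatch.StartNew();
        await Next.Invoke(context);   // run the rest of the middleware pipeline
        stopwatch.Stop();

        if (stopwatch.ElapsedMilliseconds > 500)
        {
            Trace.WriteLine("Slow request: " + context.Request.Path + " took " + stopwatch.ElapsedMilliseconds + "ms");
        }
    }
}

It would then be registered in the Startup class with app.Use<TimingMiddleware>().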

Ultimately, the aim with OWIN and Katana is to make EVERYTHING X-copy-able.  Literally no more installing or separately configuring things like IIS.  It can all be done within code to configure your application, which can then be simply x-copy’d from one place to another.

 

The next session up is Pete Smith’s “Beyond Responsive Design – UI for the Modern Web Application”.

Pete starts by reminding us how we first built web applications for the desktop, then the mobile phone market exploded and we had to make our web apps work well on mobile phones, each of which had their own screen sizes/resolutions etc.  Pete talks about how web apps designed for the desktop don’t really look good on constrained mobile phone screens.  We first tried to solve it with responsive design, but that often leads to having to support multiple code bases, one for desktop and one for mobile.  Pete says that there are many problems with web apps.  What do we do with all the screen space on a big desktop screen?  There are no real design guidelines or principles.

Pete starts to look at design paradigms on mobile apps and shows how menus work on Android using the Hamburger button that allows a menu to slide out from the side of the screen.  This is doable due to Android devices often having fairly large screens for a mobile device.  However, menus on iPhones (for example), where the screen is much narrower, don’t slide out from the side of the screen but rather slide up from the bottom of the screen.  Pete continues through other UI design patterns like dialogs, header bars and property sheets and how they exist for the same reasons, but are implemented entirely differently on desktops and each different mobile device.  Pete states that some of these design patterns work well, such as hamburger menus and flyout property sheets (notifications), however, some don’t work so well, such as dialogs that purposely don’t fill the entire mobile device screen, but keep a small border around the dialog.  Pete says that screen real estate is at a premium on a mobile device, so why intentionally reserve a section of the screen that’s not used?

The homogeneous approach to modern web app development is to use design patterns that work well on both desktop and mobile devices.  Pete uses the new Azure portal with its concept of “blades” of information that fly out and stack horizontally, but scroll vertically independently from each other.  This is a design paradigm that works well on the desktop as well as translating well to mobile device “pages” (think of how Android “pages” have header bars that have back and forward buttons).

Pete then shows us a demo of a fairly simple mock-up of the DDD East Anglia website and shows how the exact same design patterns of a hamburger menu (that flies in from the left) and “property sheets” that fly in from the right (used for speaker bios etc.) work exactly the same (with responsive design for the widths etc.) on both a desktop web app and on mobile devices such as an iPad.

Pete shows us the code for his sample application, showing some LESS stylesheets, which he says are invaluable for laying out an application like this as the actual page layout is best achieved by absolutely positioning many of the page elements (the hamburger menu, the header bar, the left-hand menu etc.) using LESS mixins.  The main page uses HTML5 semantic markup and simply includes the headerbar and the menu icons on it, the left-hand menu (that by default is visible on devices with an appropriate width) and an empty <main> section that will contain the individual pages that will be loaded dynamically with JavaScript.

Pete finalises by showing a “full-blown” application that he’s currently writing for his client company to show that this set of design paradigms does indeed scale to a complete large application!  Pete is so passionate about bringing a comprehensive set of working design guidelines and paradigms to the wider masses that he’s started his own open working group to do this, called OWAG – The Open Web Apps Group.  They can be found at:  http://www.github.com/owag

 

The next session is Matt Warren’s “Performance is a feature!” which tells us that performance of our applications is a first-class feature which should be treated the same as usability and all other basic functionality of our application.  Performance can be applied at every layer of our application from the UI right down to the database or even the “raw metal” of our servers, however, Matt’s talk will focus on extracting the best performance of the .NET CLR (Common Language Runtime) – Matt does briefly touch upon the raw metal, which he calls the “Mechanical Sympathy” layer, and suggests looking into the Disruptor pattern which allows certain systems (for example, high frequency trading applications) to scale to processing many millions of messages per second!

Matt uses Stack Overflow as a good example of a company taking performance very seriously, and cites Jeff Atwood’s blog post, “Performance is a feature”, as well as some humorous quotations (see images) as something that can provide inspiration for improvement.

Matt starts by asking Why does performance matter?, What do we need to know? and When do we need to optimize performance?

The Why starts by stating that it can save us money.  If we’re hosting in the cloud where we pay per hour, we can save money by extracting more performance from fewer resources.  Matt continues to say that we can also save power by increasing performance (and money too as a result) and furthermore, bad performance can lead to broken applications or lost customers if our applications are slow.

Matt does suggest that we need to be careful and land somewhere in the middle of the spectrum between “optimizing everything all the time” (which can back us into a corner) and “don’t optimize anything” (the extreme end of the “premature optimization is the root of all evil” approach).  Matt mentions various quotes by famous software architects, such as Rico Mariani from Microsoft who states “Never give up your performance accidentally”.

Matt continues with the “What”.  He starts by saying that “averages are bad” (such as “average response time”); we need to look at the edge cases and the outlier values.  We also need useful and meaningful metrics and numbers around how we can measure our performance.  For web site response times, we can say that most users should see pages load in 0.5 to 1.5 seconds, and that almost no-one should wait longer than 3 seconds, however, how do we define “almost no-one”?  We need absolute numbers to ensure we can accurately measure and profile our performance.  Matt also states that it’s a known fact that if only 1% of pages take (for example) more than 3 seconds to load, much more than 1% of users will be affected by this!

Matt continues with the When?  He says that we absolutely need to measure our performance within our production environment.  This is totally necessary to ensure that we’re measuring based upon “real-world” usage of our applications and everything that entails. 

Matt talks about the How? of performance.  It’s all about measuring.  Measure, measure, measure!  Matt mentions the Stack Overflow developed “MiniProfiler” for measuring where the time is spent when rendering a complete webpage as well as OpServer, which will profile and measure the actual servers that serve up and process our application.  Matt talks about micro-benchmarking which is profiling small individual parts of our code, often just a single method.  He warns to be careful of the GC (Garbage Collector) as this can and will interfere with our measurements and shows some code involving forcing a GC.Collect() before timing the code (usually using a Stopwatch instance) which can help.  He states that allocations (of memory) are cheap but cleaning up after memory is released isn’t.  Another tool that can help with this is Microsoft’s “PerfView” tool which can be run on the server and will show (amongst lots of other useful information) how and where the Garbage Collector is being called to clean up after you.
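To make that concrete, a crude micro-benchmark of the sort Matt describes might look like this (the iteration count and the method being timed are my own placeholders):

// Encourage a clean slate so that a pending garbage collection doesn't skew the timing.
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

var stopwatch = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
    MethodUnderTest();   // hypothetical method being measured
}
stopwatch.Stop();
Console.WriteLine("Elapsed: " + stopwatch.ElapsedMilliseconds + "ms");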

Matt finishes up by saying that static classes, although often frowned upon for other reasons, can really help with performance improvements.  He says to not be afraid to write your own tools, citing Stack Overflow’s “Dapper” and “Jil” tools to perform their own database access and JSON processing, which have been, performance-wise, far better for them than other similar tools that are available.  He says the main thing, though, is to “know your platform”.  For us .NET developers, this is the CLR, and understanding its internals on a fundamental and deep level is essential for really maximizing the performance of our own code that runs on top of it.  Matt talks, finally, about how the team at Microsoft learned a lot of performance lessons when building the Roslyn compiler and how some seemingly unnecessary code can greatly help performance.  One example was a method writing to a log file and that adding .ToString() to int values before passing to the logger can prevent boxing of the values, thus having a beneficial knock-on effect on the Garbage Collector.
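As a simplified illustration of that boxing point (not Roslyn’s actual logging code), consider a logger method that takes object parameters:

static void Log(string format, params object[] args)
{
    // Imagine this formats the message and writes it to a log file.
}

int requestCount = 42;
Log("Requests so far: {0}", requestCount);              // the int is boxed into an object
Log("Requests so far: {0}", requestCount.ToString());   // no boxing - a string is already a reference type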

 

After Matt’s talk it was time for lunch.  As is the custom at these events, lunch was the usual brown-bag affair with a sandwich, a packet of crisps, some fruit and a bottle of water.  There were some grok talks happening over lunch in the main concert hall, and I managed to catch one given by Iris Classon on Windows Universal application development, which involves developing XAML-based applications for both Windows desktop and Windows Phone.

 

 

After lunch is Mark Rendle’s “The vNext Big Thing – ASP.NET shrinks down and grows up”.  Mark’s talk is all about the next version of ASP.NET that is currently in development at Microsoft.  The entire redevelopment is based around slimming down ASP.NET and making the entire framework as modular and composable as possible.  This is largely as a response to other web frameworks that already offer this kind of platform, such as NodeJs.  Mark even calls it NodeCS!

Mark states that they’re making a minimalist framework and runtime and that it’s all being developed as fully open source.  It’s built so that everything is shippable as a NuGet package, and it’s all being written to use runtime compilation using the new Roslyn compiler.  One of the many benefits that this will bring is the ability to “hot-swap” components and assemblies that make up a web application without ever having to stop and re-start the application!  Mark gives the answer to “Why are Microsoft doing this?” by stating that it’s all about helping versioning of .NET frameworks, making the ASP.NET framework modular, so you only need to install the bits you need, and improving the overall performance of the framework.

The redevelopment of ASP.NET starts with a new CLR.  This is the “CoreCLR”.  This is a cut-down version of the existing .NET CLR and strips out everything that isn’t entirely necessary for the most “core” functions.  There’s no “System.Web” in the ASP.NET vNext version.  This means that there’s no longer any integrated pipeline and it also means that there’s no longer any ASP.NET WebForms!

As part of this complete re-development effort, we’ll get a brand new version of ASP.NET MVC.  This will be ASP.NET MVC 6.  The major new element to MVC 6 will be the “merging” of MVC and WebAPI.  They’ll now be both one and the same thing.  They’ll also be built to be very modular and MVC will finally become fully asynchronous just as WebAPI has been for some time already.  Due to this, one interesting thing to note is that the ubiquitous “Controller” base class that all of our MVC controllers have always inherited from is now entirely optional!

Mark continues by taking a look at another part of the complete ASP.NET re-boot.  Along with new MVC’s and WebAPI’s, we’ll also get a brand new version of the Entity Framework ORM.  This is Entity Framework 7 and most notable about this is that the entire notion of database first (or designer-driven) database mapping is going away entirely!  It’s code-first only!  There’ll also be no ADO.NET and Entity Framework will now finally feature first-class support for non-SQL databases (i.e. NoSQL/Document databases, Azure Tables).

The new version of ASP.NET will bring with it lots of command line tooling, and there’s also going to be first class support for both Mac and Linux.  The goal, ala NodeJS, is to be able to write your entire application in something as simple as a text editor, with all of the application and configuration code in simple text-based code files.  Of course, the next version of Visual Studio (codenamed, Visual Studio 14) will have full support for the new ASP.NET platform.  Mark also details how the configuration of ASP.NET vNext developed applications will no longer use XML (or even a web.config).  They’ll use the currently popular JSON format instead inside of a new “config.json” file.

Mark proceeds by showing us a quick demo of the various new command line tools which are all named starting with the letter K.  There’s KVM, which is the K Version Manager and is used for managing different versions of the .NET runtime and framework.  Then there is KPM which is the K Package Manager, and operates similar to many other package managers, such as NodeJS’s “npm”, and allows you to install packages and individual components of the ASP.NET stack.  The final command line tool is K itself.  This is the K Runtime, and its command line executable is simply called “K”.  It is a small, lightweight process that is the runtime core of ASP.NET vNext itself. 

Mark then shows us a very quick sample website that consists of nothing more than 2-3 lines of JSON configuration, only 1 line of actual code (a call to app.UseStaticFiles() within the Startup class’s “Configure” method) and a single file of static HTML, and the thing is up and running, writing the word “Hurrah” to the page.  The Startup.cs class is effectively a single class replacement for the entire web.config and the entire contents of the App_Start folder!   The Configure method of the Startup class is effectively a series of calls to various .UseXXX methods on the app object:

app.UseStaticFiles(); 
app.UseEntityFramework().AddSqlServer(); 
app.UseBrowserLink(); 
etc.

Mark shows us where all the source code is. It’s all right there on public GitHub repositories and the current compiled binaries and packages can be found on myget.org.  Mark closes the talk by showing the same simple web app from before, but now demonstrating that this web app, written using the “alpha” bits from ASP.NET vNext, can be run on an Azure website instance quite easily.  He commits his sample code to a GitHub repository that is linked to auto-deploy to a newly created Azure website and lets us watch as Azure pulls down all the required NuGet packages and eventually compiles his simple web application in real time and spins up the website in his browser!

 

The final talk of the day is Barbara Fusinska’s “Architecture – Why so serious?”. This talk is about Barbara’s belief that all software developers should be architects too.  She starts by asking “What is architecture?”.  There are a number of answers to this question, depending upon who you ask.  Network distribution, Software Components, Services, APIs, Infrastructure, Domain Design.  All of these and more can be a part of architecture. 

Barbara says her talk will be given by showing a simple demo application called “Let’s go out” which is a simple scheduler application.  She will show how architecture has permeated all the different parts of the application.  Barbara starts with the “basics”.  She broaches the subject of application configuration and says how it’s best to start as you mean to go on by using an IoC container to manage the relationships and dependencies between objects within the application.

She continues by saying that one of the biggest and most fundamental problems of virtually all applications is how to pass data between the code of our application and the database, and vice-versa.  She mentions ORMs and suggests that the traditional large ORMs are often far too complicated and can frequently bog us down with complexity.  She suggests that the more modern Micro-ORMs (of which there are Dapper, PetaPoco & Massive amongst others) offer a better approach and are a much more lightweight layer between the code and the data.  Micro-ORMs “bring SQL to the front” which is, after all, what we use to talk to our database.  Barbara suggests that it’s often better to not attempt to entirely abstract the SQL away or attempt to hide it too much, as can often happen with a larger, more fully-featured ORM tool.  On the flip-side, Barbara says that full-blown ORMs will provide us with an implicit unit of work pattern implementation and are better suited to Domain driven design within the database layer.  For Barbara’s demo application, she uses Mark Rendle’s Simple.Data micro-ORM.

Barbara says that the Repository pattern is really an anti-pattern and that it doesn’t really do much for your application.  She talks about how repositories often will end up with many, many methods that are effectively doing very similar things, and are used in only one place within our application.  For example, we often end up with “FindCustomersByID”, “FindCustomersByName”, “FindCustomerByCategory” etc. that all effectively select data from the customers database table and only differ by how we filter the customers.

Barbara shows how her own “read model” is a single class that deals with only reading data from the database and actually lives very close to the code that will use it, often an MVC controller action.  This is similar to a CQRS pattern and the read model is very separate and distinct from the domain model.  Barbara shows how she uses a “command pattern” to provide the unit of work and the identity pattern for the ORM.  Barbara talks about the Services within her application and how these are very much all based upon the domain model.  She talks about only exposing a method to perform some functionality, rather than exposing properties for example.  This applies not just to the user, but to other programmers who might have access to our classes.  She makes the property accessors private to the class and only allows access to them via a public method.  She shows how her application allows moving a schedule entry, but the business rules should only allow it to be moved forward in time.  Exposing DateTime properties would allow setting any dates and times, including those in the past and thus violating the domain rules.  By only allowing these properties to be set via a public method, which performs this domain validation, the setting of the dates and times can be better controlled.

Barbara says that the Command pattern is actually a better approach than using Services as they can greatly reduce dependencies within things like MVC Controllers.  Rather than having dependencies on multiple services like this:

public MyCustomerOrderController(ICustomerService customerService, IOrderService orderService, IActivityService activityService)
{
   ...
}

Here, the controller’s purpose is to provide a mechanism to work with customers, the orders placed by those customers and the activity on those orders.  We can, instead, “wrap” these services up into commands.  These commands will, internally, use multiple services to implement a single domain “command” like so:

public MyCustomerOrderController(IAddActivityToCustomerOrderCommand addActivityCommand)
{
   ...
}

This provides a single domain command to perform the specific domain action, and means that the MVC Controller that’s used for the UI that allows customers to be added to activities only has one dependency, the Command class itself.
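A minimal sketch of what the internals of such a command might look like (the interface shape, service methods and parameter names here are my own assumptions for illustration, not Barbara’s actual code):

public interface IAddActivityToCustomerOrderCommand
{
    void Execute(int customerId, int orderId, string activityDescription);
}

public class AddActivityToCustomerOrderCommand : IAddActivityToCustomerOrderCommand
{
    private readonly ICustomerService _customerService;
    private readonly IOrderService _orderService;
    private readonly IActivityService _activityService;

    public AddActivityToCustomerOrderCommand(ICustomerService customerService,
        IOrderService orderService, IActivityService activityService)
    {
        _customerService = customerService;
        _orderService = orderService;
        _activityService = activityService;
    }

    public void Execute(int customerId, int orderId, string activityDescription)
    {
        // The command co-ordinates the individual services internally, so the
        // controller only ever needs to depend upon this single abstraction.
        // (GetById, GetOrderForCustomer and AddActivity are hypothetical service methods.)
        var customer = _customerService.GetById(customerId);
        var order = _orderService.GetOrderForCustomer(customer, orderId);
        _activityService.AddActivity(order, activityDescription);
    }
}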

 

With the final session over, it was time to head back to the main concert hall to wrap up the day’s proceedings, thank all those who were involved in the event and to distribute the prizes, generously donated by the various event sponsors.  No prizes for me this time around, although some very lucky attendees won quite a few prizes each!

After the wrap up there was a drinks reception in the same concert hall building, however, I wasn’t able to attend this as I had to set off on the long journey back home.  It was another very successful DDD event, and I can’t wait until they do it all over again next year!

DDD North 2014 In Review


This past Saturday, 18th October 2014, saw another DDD (Developer, Developer, Developer) event.  This one was the 4th annual DDD North event, this year held at the University Of Leeds.

After arriving and signing in, I proceeded through the corridors to the communal area where we were all greeted with a cup of coffee (or tea) and a nice Danish pastry!  It’s always a nice surprise to get a cake with your morning coffee, so although I wasn’t really hungry as I’d recently eaten a large breakfast, I decided that a Danish pastry covered in sweet, sweet icing was too much of a temptation to be able to refuse!  After this delightful breakfast, I headed down the corridor for the first of the day’s sessions.

The first session of the day is Liam Westley’s “An Actor’s Life For Me” which talks about parallel processing with multiple threads using the Task Parallel Library and utilising the Actor Model.  Liam introduces the Actor model and states it was first described by Carl Hewitt as early as 1973.  The dilemma we have for parallel processing is due to shared state, causing us to lock around areas of memory where multiple threads may try to access that state.  The Actor model solves this by not having shared state within the system, instead having each process take stateless data that is not shared and outputting stateless data to the next process in the processing pipeline.  Liam uses an analogy of making a cup of tea and the steps involved in that whilst also getting an itch that needs scratching whilst making that cup of tea.  The itch (and thus the scratch) can happen during any of the tea-making steps, so the number of possible combinations of tea-making and scratching steps grows exponentially.

Liam talks about how CPUs have been multi-threaded and multi-core for many years now, first arriving around the same time as .NET v1.0, whilst in the same time frame, our developer tools haven’t really kept up.  .NET 1.0 pretty much gave us raw access to how Windows handles threads using the ThreadPool, which meant managing multiple threads and sharing state between them was very difficult.  .NET 2.0 gave us a SynchronizationContext, but multi-threaded programming was still very hard.  Eventually, we got the much simplified async & await keywords, but now we have the Task Parallel Library which provides us with the Actor pattern.  This basically allows us to write our code in individual “blocks” which are essentially black boxes sharing no state with any other block.  We can then chain these blocks together into a processing pipeline, giving us the ability to perform some computational process without sharing state.

Liam then shows us a demo of a console application which produces an MD5 hash for a number of large files in a folder.  The first iteration of the demo shows this happening without using the Task Parallel Library (TPL) and so performs no parallel processing and simply processes each file, one at a time on a single thread, taking some time to complete.  The second iteration Liam shows us uses the TPL, but still only works in a single-threaded manner by wrapping the hash calculation function as a TPL ActionBlock.  This iteration does the same as the single-threaded version, as again, no parallel processing is occurring.  The final iteration runs in a multi-threaded manner by simply setting the MaxDegreeOfParallelism property on the block’s configuration (ExecutionDataflowBlockOptions).  What’s really amazing about these ActionBlocks is that they inherently and implicitly handle all input and output buffering and queuing by themselves. This means we can add many blocks into the processing pipeline at a faster rate than they can be executed, and the TPL will handle the queuing for us.
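A sketch of that final, parallel iteration might look something like this (it needs the System.Threading.Tasks.Dataflow NuGet package; the folder path and degree of parallelism are my own placeholder values rather than Liam’s):

var hashFiles = new ActionBlock<string>(filePath =>
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(filePath))
        {
            Console.WriteLine(filePath + ": " + BitConverter.ToString(md5.ComputeHash(stream)));
        }
    },
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

foreach (var file in Directory.EnumerateFiles(@"C:\LargeFiles"))
{
    hashFiles.Post(file);           // the block buffers and queues the work internally
}

hashFiles.Complete();               // signal that no more files are coming
hashFiles.Completion.Wait();        // wait for every queued hash to finish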

Liam next talks about separating the processing and calculating of the file hashes by performing these in a TransformBlock rather than an ActionBlock, and only using ActionBlocks to print the hash value to the UI.  The output of the TransformBlock (the hash value and the filename) is passed to the ActionBlock in the processing pipeline.

Liam then introduces the BufferBlock.  This acts as a propagator between other blocks and a FIFO queue of data.  Liam talks about how, in our example, we can add a BufferBlock in front of all of the TransformBlocks which will effectively evenly distribute the “load” as we provide the files to be hashed between the TransformBlocks. 

Next, Liam shows how we can use the LinkTo method which allows us to filter the messages passed along the processing pipeline, as the LinkTo method allows us to pass a predicate to perform the filtering.  This could be used (for example) to hash files of different types with different TransformBlocks (i.e. an MP3 file is processed differently than an MP4 file etc.).  Liam also introduces the TransformManyBlock which takes an IEnumerable of things to process.  This means we no longer have to have our own loop through each of the files to be processed; instead, we can simply pass in the contents of the folder’s files as a complete IEnumerable collection.
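Putting the TransformBlock, ActionBlock and LinkTo pieces together, the pipeline might be wired up roughly like this (my own sketch rather than Liam’s exact demo code):

var hashFile = new TransformBlock<string, Tuple<string, string>>(filePath =>
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(filePath))
        {
            return Tuple.Create(filePath, BitConverter.ToString(md5.ComputeHash(stream)));
        }
    },
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

var printResult = new ActionBlock<Tuple<string, string>>(result =>
    Console.WriteLine(result.Item1 + ": " + result.Item2));

// The predicate means only MP3 files flow to this particular ActionBlock...
hashFile.LinkTo(printResult, new DataflowLinkOptions { PropagateCompletion = true },
    result => result.Item1.EndsWith(".mp3", StringComparison.OrdinalIgnoreCase));

// ...so anything that doesn't match needs somewhere else to go, even if that's the bin.
hashFile.LinkTo(DataflowBlock.NullTarget<Tuple<string, string>>());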

Finally, Liam mentions both the BroadcastBlock and the BatchBlock.  The Broadcast block is effectively a pub/sub mechanism as used in Message Buses etc. which allows fanning-out of the messages and broadcasting to other blocks.  The BatchBlock allows batching of inputs before passing the messages along the processing pipeline.

All in all, Liam’s talk was very informative and shows just how far we’ve come in our ability to relatively easily and simply perform parallel processing in a multi-threaded manner, taking advantage of all of the cores available to us on a modern day machine.  Liam’s demo code has been made available on GitHub for those interested in learning more.

 

The next talk is Ian Cooper’s “Not Just Layers! – What can pipelines and events do for you?”, which is a talk about Data Flow Architectures, and specifically Pipelines and Events.  Ian first talks about general software architecture and how processes evolve from basic application of a skill through to adoption of genuine craftsmanship and best-practices.  Software Architecture has many styles, but a single style can be explained as a series of components and connectors.  Components are the individual parts of an architecture that do something and the connectors are how multiple components talk to each other.

Ian states that Data Flow architectures are more driven by behaviour rather than state, and says that functional languages (such as F#) are better suited to behaviourally modelled architecture, whereas object oriented (OO) languages like C# are better suited to solve state driven processes and architectures.

Ian uses the KWIC (Keyword in context) algorithm, which is how Unix indexes text in its man pages, as the reference for the session.

Ian talks about pipes and filters, and states that it’s a flow of data processing along a pipeline of specific stages.  A push pipeline “pushes” tasks along the pipeline, the pipeline usually consisting of a pump at the front, which pushes data into the pipeline, with a series of filters which are the processing tasks and with each preceding filter responsible for pushing the data to the succeeding filter in the pipeline.  There’s also usually a sink at the end that provides the final end result.  There are also pull pipelines, of which .NET’s LINQ is an example, which have each filter further along the pipeline doing the pulling of the data from the previous filter, rather than the previous filter pushing the data on.

Ian mentions how the pipes and filters architecture is similar to a batch sequence architecture (see below for the subtle difference between them).  He talks about how errors that may happen in a long-running sequence that need the entire processing stream to be undone are better suited to a batch sequence architecture than a pipes and filters architecture, due to the more disconnected nature of the pipes and filters architecture.

Ian talks about parallel execution and the potential pub/sub problem of consumers awaiting data and not knowing when the entire workload is completed.  If individual steps are either faster or slower than the preceding or succeeding steps in the chain, this can cause problems with either no data, or too much data to process.  The solution to this problem is to introduce a “buffer” in between steps within the chain.  Such things as Message Queues (i.e. MSMQ, RabbitMQ etc) or in-memory caching mechanisms (such as those provided by tools like Redis) can offer this.

Ian then shows us an in-memory demo of a program using the pipes and filters architecture.  Ian states that, ideally, filters in a pipeline shouldn’t really know about other filters, but it’s okay for them to be aware of an abstraction of the next filter in the pipeline, but not the concrete instance of that filter.  Ian uses the KWIC algorithm for the demo code.  Ian shows the same demo using the manual pipeline and filters, and also a LINQ implementation.  The LINQ example has its filters implemented as fluent method calls simply chained together (i.e. TextLines.Shift(x=>x).RemoveNoise(x => x).Sort() etc.).  Ian then shows the same example as written in F#.  This shows that the pipeline, using F#’s pipeline operator “|>”, is even simpler to see from the code that implements it.

Ian shows us the demo code using a message queue (using MSMQ behind the scenes); this shows a pull-based pipeline where each filter down the chain pulls messages from a message queue to which messages are posted by the preceding filter in the pipeline chain.  Ian also shows us the pipeline running in a parallel manner, using the Task Parallel Library.  Each filter has distinct inputs and outputs defined as BlockingCollection<T>, allowing the data to flow in and out, but to be blocked on the individual thread if the next filter in the pipeline isn’t ready to receive that data.
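A very stripped-down filter of that shape might look something like the sketch below (my own simplified example rather than Ian’s actual KWIC code); each filter would run on its own task, with a pair of these collections acting as the buffered pipes between stages:

public static class UpperCaseFilter
{
    public static void Run(BlockingCollection<string> input, BlockingCollection<string> output)
    {
        // GetConsumingEnumerable blocks until data arrives, and finishes once the
        // upstream filter has called CompleteAdding on the input collection.
        foreach (var line in input.GetConsumingEnumerable())
        {
            output.Add(line.ToUpperInvariant());   // stand-in for the real filter work
        }

        // Signal the next filter in the chain that no more data is coming.
        output.CompleteAdding();
    }
}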

Finally, Ian talks about Batch Sequences and how they differ slightly from a pipes and filters architecture.  He talks about how you did Batch Sequencing many years ago with magnetic tapes being passed from one reel-to-reel processing machine to the next!  The main difference between Batch Sequence and Pipes and Filters is that in a batch sequence, each filter has to complete the entire workload of data before passing everything as output to the next filter in the chain.  By contrast, pipes and filters will have its filter only process one small piece of work or one individual piece of data before passing it down the processing chain.  This means that true pipes and filters is much better suited to being parallelized than a batch sequence architecture.

 

The next session is Richard Tasker’s “BDD and why you should be doing it”.  Richard starts by introducing BDD (Behaviour Driven Development) and where it originated.  It was first proposed by Dan North as a “solution” to some of the failings of TDD such as: Where do you start with TDD?  What to test and what not to test?  And how much to test in one go?

Richard starts by talking about his first exposures to understanding BDD.  This started with writing expressive names for standard unit tests.  This helps understand what the test is testing and thus, what the code is doing, i.e. the expression of a behaviour of the code.  It’s from here that we can see how we can make the mental leap from testing and exercising small methods of code to testing the more user-centric behaviour of the overall application.

Richard shows a series of Database Entity Relationship diagrams as the first mechanism he used to design an application used to model car parts and their relation to vehicles.  This had to go through a number of iterations to fully realise the entities involved and their relationships to each other and it wasn’t the most effective way to achieve the overall design.  Using a series of User Stories which could be turned into BDD tests was the way forward.

Richard next introduced the MoSCoW method as the way in which he started writing his BDD tests.  Using this method combined with the new style of user story templates emphasises the behaviour and business function.  Instead of writing “As a <type of user> I want <some functionality> so that <some benefit>”, we instead write, “In order to <achieve some value>, as a <type of user>, I should have <some functionality>”.  The last part of the user story gets the relevant must/should/could/won’t wording in order to help achieve effective prioritization with the customer.

Richard then introduces SpecFlow as his BDD tool of choice.  He shows a simple demo of a single SpecFlow acceptance test, backed by a number of standard unit tests.  Richard says that you probably don’t want to do this for every individual tiny part of your application as this can lead to an abundance of unit tests and further lead to a test maintenance burden.  To help solve this, Richard talks about Decision Frameworks, of which a popular one is called “Cynefin”.   It defines states of Obvious, Chaotic, Complex and Complicated.  Each area of the application and discrete pieces of functionality can be assessed to see which of the four Cynefin states they may fall into.  From here, we can decide how many or how few BDD Acceptance tests are best utilised for that feature to deliver the best return on investment.  Richard says that Acceptance tests are often best used in Complicated & Complex states, but are often less useful in Obvious & Chaotic states.
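For anyone who hasn’t come across SpecFlow, a binding class for a single acceptance test looks roughly like the sketch below (the scenario wording, the Agenda class and the NUnit-style assertion are all my own made-up example rather than Richard’s demo):

// In the .feature file:
//   In order to plan my day at the conference
//   As an attendee
//   I should have a personalised agenda
//
//   Given I have selected the session "BDD and why you should be doing it"
//   Then my agenda should contain 1 session

[Binding]
public class AgendaSteps
{
    private readonly Agenda _agenda = new Agenda();   // Agenda is a hypothetical domain class

    [Given(@"I have selected the session ""(.*)""")]
    public void GivenIHaveSelectedTheSession(string sessionName)
    {
        _agenda.AddSession(sessionName);
    }

    [Then(@"my agenda should contain (\d+) session")]
    public void ThenMyAgendaShouldContainSessions(int expectedCount)
    {
        Assert.AreEqual(expectedCount, _agenda.Sessions.Count);
    }
}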

Richard closes his session with “why” we should be doing BDD.  He talks about many of the benefits of adopting BDD and says that it is a great helper for teams that are new to TDD.  Richard says that BDD helps to reduce communication barriers between the developers and other technical professionals and the perhaps less technical business stakeholders and that BDD also helps with prioritizing which features should be implemented before others.  BDD also helps with naming things and defining the specific behaviours of our application in a more user-oriented way and also helps to define the meaning of “done”. 

 

After Richard’s talk, it was lunchtime.  Lunch was served in the same communal area where we’d all gathered earlier at breakfast time and consisted of a rather nice sandwich, a bag of crisps and a drink.  It was nice that all three ingredients could be chosen by each individual attendee from a selection available.

After enjoying this very nice lunch, I decided to skip the Grok talks (these are short, 10 minute talks that generally happen over lunchtime at the various DDD conferences) and get some fresh air outside.  That didn’t last too long, as I found the Pack Horse pub just down the road from the area of the university used for the conference.  This is a pub belonging to a small local microbrewery called The Burley Street Brewhouse.  I decided I had to go in and sneak a cheeky pint of bitter as a lunchtime treat.  It was indeed a lovely pint and afterwards, I headed back to the university and to the DDD North conference.  I went back in via an entrance close to the communal area still housing some conference attendees and realised that a number of sandwiches and crisps were still available for any attendee that wanted 2nd helpings!  I was still a bit peckish after my liquid refreshment (and knowing that I wouldn’t be eating until quite late in the evening at the after-conference Geek Dinner), so I decided to go for seconds!  After enjoying my second helpings, I headed off for the first session of the afternoon.

 

The first afternoon session is Andrew MacDonald’s “CQRS & Event Sourcing”.  Andrew first talks about the how & why of starting development in a brand new project.  Andrew has his own development project, treevue.com, for which he decided to try out CQRS and event sourcing as they were two new interesting techniques that Andrew believed could help with the development of his software.  treevue.com is a web product which offers virtual data rooms.  Andrew talks about the benefits of CQRS & Event sourcing such as allowing a truly abstracted data storage model, providing domain driven design without noise and that separating reads and writes to the data model via CQRS could open up new possibilities for the software.  Andrew states that it’s not appropriate for everything and quotes Udi Dahan who said that most people who have used CQRS shouldn’t have done so!

CQRS is Command Query Responsibility Segregation and allows commands (processes that alter our data) to be separate from and entirely distinct from Queries (processes that only read our data but don’t change it).  The models behind each of these can be entirely different, even when referring to the same domain entities, so a data model for reading (for example) a Customer type can have a different design when reading than when writing.

Andrew talks about the overall architecture of a system that employs CQRS vs. one that doesn’t.  Without CQRS, reads and writes flow through the same layers of our application.  With CQRS, we can have entirely different architectures for reading vs. writing.  Usually the writing architecture is similar to the entire non-CQRS architecture, flowing through many layers including data access, validation layers etc., but often the reading architecture uses a much flatter set of layers to read the data as concerns such as validation are generally not required in this context.  The two separate reading and writing stacks can often even connect to separate databases which provide “eventual consistency” with each other.  This also means reading and writing can scale independently of each other, and given that many apps read far more than write, this can be invaluable.

Andrew then introduces Event Sourcing which, whilst separate and different from CQRS, does play well with it.  Andrew shows a typical relational model of a purchase order with multiple purchase order line item types related to it and a separate shipping info type attached.  This model only allows us to see the state of the order and its data as it stands right now.  Event sourcing shows the timeline of events against the purchase order as each alteration to the entity is stored separately in an event queue/database.  For example, a line item is added with an (incorrect) quantity of 4, but is corrected with a later event deducting 2 from the line item, leaving a line item with a correct quantity of 2.  This provides us with the ability to not only see how the data looks “right now”, but to be able to create the entire state of the entity model at any given point in time.
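As a rough sketch of that idea (my own simplified illustration, not Andrew’s treevue code), an event-sourced purchase order rebuilds its state by replaying its stored events in order:

using System.Collections.Generic;

public abstract class PurchaseOrderEvent { }

public class LineItemAdded : PurchaseOrderEvent
{
    public string Product { get; set; }
    public int Quantity { get; set; }
}

public class LineItemQuantityAdjusted : PurchaseOrderEvent
{
    public string Product { get; set; }
    public int Adjustment { get; set; }   // e.g. -2 to correct an over-ordered line
}

public class PurchaseOrder
{
    private readonly Dictionary<string, int> _lineItems = new Dictionary<string, int>();

    public IDictionary<string, int> LineItems { get { return _lineItems; } }

    // Replaying all stored events rebuilds the current state; replaying only the
    // first n events gives the state of the order as at that point in time.
    public static PurchaseOrder FromEvents(IEnumerable<PurchaseOrderEvent> events)
    {
        var order = new PurchaseOrder();
        foreach (var evt in events)
        {
            order.Apply(evt);
        }
        return order;
    }

    private void Apply(PurchaseOrderEvent evt)
    {
        var added = evt as LineItemAdded;
        if (added != null)
        {
            _lineItems[added.Product] = added.Quantity;
            return;
        }

        var adjusted = evt as LineItemQuantityAdjusted;
        if (adjusted != null)
        {
            // Assumes the line item was added by an earlier event.
            _lineItems[adjusted.Product] += adjusted.Adjustment;
        }
    }
}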

Andrew then proceeds to talk about Azure’s role in his treevue app and how he’s utilised Azure’s Table Storage as a first class citizen.  He then shows us a quick demo and some code using EventProcessors and CommandProcessors which effectively implement the CQRS pattern. 

Finally, Andrew shows how he uses something called a “snapshot” when reading domain aggregates, which is effectively a caching layer used to improve performance around building the domain aggregate models from the various events that make up a specific state of the model as at a certain point in time.  This is particularly important when running applications in the cloud and using such technology as Azure Table Storage, as this will only serve back a maximum of 1000 rows per query before you, as the developer, have to make further requests for more data.  Andrew points out that the demo code is available on GitHub for those interested in diving deeper and learning more from his own implementation.

 

20141018_154117_LLS The final session for today is David Whitney’s “Lessons Learnt running a public API”.  David is a freelance consultant who has worked for many companies writing large public APIs.  The reference point used throughout David’s talk is the work he did with Just Giving.  David states how the project to build the Just Giving API grew so large that the API eventually became the company’s biggest revenue stream.

David’s talk is a fast-paced set of tips, tricks and lessons that he has personally learned over many years of working with clients developing their large public-facing APIs.

David starts by stating that your API is your public-facing contract with the world, and that it will live or die by the strength of its documentation.  If the documentation is bad, people will write bad implementations, and you can’t blame them when that happens.  Documentation for APIs can either be created first, which then drives the design of the API, or it can be done the other way around, where you write the API first and document it afterwards.  Either approach is viable, so long as documentation does indeed exist and is sufficiently comprehensive to allow your consumers to build quality implementations against your API.  David says it’s often best to host the docs with the API itself, so that if you hit the API endpoint with a web browser as a human user, you’re served the API documentation.

David states that the DTOs returned from API calls should provide “examples” of themselves.  This is a simple mechanism that lets users “discover” your API and helps them to understand just how they should use it.  Code such as this:

// Implemented by DTOs that can describe example instances of themselves.
public interface IProvideAnExampleOf<TMyself>
{
    ExampleOf<TMyself>[] BuildExample();
}

// Wraps a single example instance together with a human-readable description.
public class ExampleOf<T>
{
    public string Description { get; set; }
    public T Example { get; set; }

    public ExampleOf(string description, T example)
    {
        Description = description;
        Example = example;
    }
}

will enable your API to provide examples of itself to your users.  David states that anything you can do to help your API consumers will greatly cut down the inevitable avalanche of help requests that will hit you.
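
For instance, a hypothetical donation DTO might describe itself like this (my own sketch of how the interface above could be used, not code from the talk):

public class DonationDto : IProvideAnExampleOf<DonationDto>
{
    public decimal Amount { get; set; }
    public string Currency { get; set; }

    // Returned alongside the real data so consumers can see a well-formed instance.
    public ExampleOf<DonationDto>[] BuildExample()
    {
        return new[]
        {
            new ExampleOf<DonationDto>(
                "A typical donation",
                new DonationDto { Amount = 10.00m, Currency = "GBP" })
        };
    }
}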

Following on from individual examples, it’s good to have your API and its documentation provide “recipes” for how to use larger sections of your API and how to call discrete service endpoints in a coherent chain in order to achieve a specific outcome.  Recipes help your users to “fall into the pit of success”.  Providing things like a complete sample web application, ideally written in multiple languages, that exercises various parts of your API is even better.

David next talks about versioning of your API, and says that it’s something you have to ensure you have a policy on from Day 1.  Retrofitting versioning is very hard and often leads to broken or awkward implementations.  Adding version numbers to the URI is perhaps the easiest to achieve, but it’s not really the best approach.  It’s far better to add the API version in the HTTP header.
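
A minimal sketch of what that might look like server-side, assuming ASP.NET Web API and a hypothetical “Api-Version” header name (the talk doesn’t prescribe one):

using System.Collections.Generic;
using System.Linq;
using System.Net.Http;

public static class ApiVersioning
{
    // Read the requested version from a custom header, defaulting to version 1 when absent or invalid.
    public static int GetRequestedVersion(HttpRequestMessage request)
    {
        IEnumerable<string> values;
        if (request.Headers.TryGetValues("Api-Version", out values)
            && int.TryParse(values.FirstOrDefault(), out var version))
        {
            return version;
        }

        return 1;
    }
}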

He continues by talking about modifying existing API calls.  Don’t.  Just don’t do it, at any cost!  If you really must, you can add additional data to the return values of your API endpoints, but you must never change or remove anything that’s already there.  You must also never rename anything.  If you need to do any of this, use a new version.  This leads into content types, and here David states that you’ll really need to provide all the different content types that people will realistically use.  Whilst many web developers today see JSON as the de-facto standard, many companies – especially large enterprises – are still using XML as their de-facto standard, so your API is going to have to support both.  David also mentions that JSONP is something else you may well have to support for browser-based consumers, and that you’ll need to be mindful of CORS (Cross-Origin Resource Sharing) – the mechanism that allows resources such as JavaScript to call your API from domains other than the one where the API is hosted.

David talks about the importance of making statistics for your API available and public.  You need to ensure you’re gathering performance and other statistics on every method call.  One possibility is returning some statistics back to the consumer directly in the HTTP response headers after every request to your API, such as the name of the server that serviced the consumer’s request.  This is especially useful if you’ve got a large server farm and need help debugging service call issues.  You should also ensure you publicly expose your statistics in a dashboard via status updates, uptime pages and more.  For one, it’ll help you deflect any criticisms that your performance is broken, and it’ll provide consumers with confidence that your API is up, that it stays up and that you’re on top of maintaining this.  (Unless, of course, your performance really is broken, in which case that same fancy dashboard will help give you visibility into diagnosing and correcting the issue!)  David next mentions the importance of a good staging server for user testing.  Don’t simply expose an internal “test” server that you may have cobbled together; David relates first-hand experience of just how difficult it can be to get users to stop using your “test” server after you’ve allowed them access!
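
One way to surface that kind of per-request information is a message handler in the pipeline.  This is a hedged sketch using ASP.NET Web API’s DelegatingHandler; the header names are my own invention:

using System.Diagnostics;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public class ServerStatsHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var timer = Stopwatch.StartNew();
        var response = await base.SendAsync(request, cancellationToken);
        timer.Stop();

        // Tell the consumer which server handled the call and how long it took.
        response.Headers.Add("X-Served-By", System.Environment.MachineName);
        response.Headers.Add("X-Elapsed-Ms", timer.ElapsedMilliseconds.ToString());
        return response;
    }
}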

20141018_162628_LLS The next part of the session focuses on the overall approach to the design of your API.  David stresses that it’s good to go back and read the original documentation on RESTful architecture, written by Roy Fielding as a doctoral dissertation back in the year 2000.  Further, it’s important to lean on existing conventions – always return canonical URIs rather than relative ones, and always supply IDs and URIs when returning data that refers to any domain or service entity.  As well as ensuring you follow existing standards, it’s also important to investigate new, emerging standards.  Keeping an eye on standards such as HAL (Hypertext Application Language) and JSON API ensures that, should they quickly become mainstream, you can adapt your API to support them.

David continues his session by talking about the cardinal sins of API design.  First thing you must never do is this:

{
    "PageType": 1,
    "SomeText": "This is some text"
}

What, exactly, is PageType 1?  We’re talking, of course, about magic numbers.  Don’t do it.  This forces your consumers to go off and look it up in the documentation, and whilst that documentation should definitely exist, there’s no reason why you can’t provide a more meaningful value to your consumer.  You have to think like a consumer at all times and try to imagine the applications they’re going to build using your API.  Also, don’t ever ask a user for data that your API itself can’t supply – i.e. Don’t ever request some specific identifier for a resource if you don’t provide that identifier when returning that resource in other requests.  Build your services RESTfully, don’t build XML-RPC with SOAP envelopes.  Be resource oriented, and always ensure you use the correct HTTP verbs for all of your services actions – especially understand the difference between POST & PUT.
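
One common fix (a sketch of my own, not from the talk) is to expose a named value instead of a bare number, for example by serialising an enum as its name with JSON.NET:

using Newtonsoft.Json;
using Newtonsoft.Json.Converters;

// Hypothetical page types, for illustration only.
public enum PageType
{
    Unknown = 0,
    Donation = 1,
    Fundraising = 2
}

public class PageDto
{
    // Serialised as "Donation" rather than 1, so the payload explains itself.
    [JsonConverter(typeof(StringEnumConverter))]
    public PageType PageType { get; set; }

    public string SomeText { get; set; }
}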

Make sure you understand multi-tenancy and how that will impact the design and implementation of your API.  Good load balancers and proxies can balance based on request headers, so it’s really easy and useful to provide multi-tenancy in this manner.  Also ensure you use a good sandbox environment for testing and don’t forget to implement good rate limiting!   Users and consumers will make mistakes in their code and you don’t want them to take down your service when they do.

David talks about error handling and says you should validate everything you can when requests are made to your API.  Try to return errors in batches if possible, and always make sure that error messages are useful and readable.  Similar to the magic numbers above, don’t return only an arcane error code to your consumers and force them to cross-reference it from deep within your documentation.

20141018_163740_LLS David moves on to authentication for your API and states that this is an area that can get a bit painful.  Basic HTTP Auth will get you going, and can be sufficient if your API is (and will remain) fairly small scale; however, if your API is large or likely to grow to a larger scale – and especially if your API will be used by users via third parties – you’ll quickly grow out of Basic Auth and need something more robust.  He says that OAuth is the least-worst alternative: it provides good security but can be painful to implement.  Fortunately, there are many third-party providers out there to whom you can outsource your authorisation concerns.

David then discusses providing support for your API to your users.  He says the best approach is to simply put it all out there in the public domain.  This provides transparency which is a good thing, but can also encourage a “self-service” model where people within the community will start to help provide answers and solutions to other community members.  Something as simple as a Google Group or a tag on Stack Overflow can get you started.

David closes his session by stating that, as your API grows over time, always ensure that you’re never attempting to serve only a single customer.  Keep your API clean and generic and it will remain useful to all consumers, rather than compromising that usefulness for just a minority of users.  And finally, if your API is or will become a first-class product for your business, just as the Just Giving API became for them, make sure you have a full product team within your business to deal with its day to day operation and its ongoing maintenance and development.  It’s all too easy to think that the API isn’t strictly a “product” due to its highly technical and slightly opaque nature, however, doing so would be a mistake.

 

20141018_173357_LLS After David’s session, we all congregated in the main lecture theatre for the wrap up presentation from Andy Westgarth, one of the conference organisers.  This involved thanking the very generous sponsors of the event as without them there simply wouldn’t be a DDD conference, and it also involved a prize giving session – the prizes consisting of books, T-shirts, some Visual Studio headphones and a main prize of a Surface Pro 3!

After the excellent day, I headed to the pub which was very conveniently located immediately across the road from the venue entrance.  I had a few hours to kill until the Geek Dinner, which was to be held later that evening at Pizza Express in the Leeds Corn Exchange.  I enjoyed a couple of pints of Leeds Pale Ale before heading off to the Pizza Express venue for my dinner.

20141018_224309_LLS The Geek Dinner was attended by approximately 40 people and a fantastic time was had by all.  I was sat close to one of the day’s earlier speakers, Andrew MacDonald, and we had a good old chinwag about past projects, work, and life as a software developer in general.

Overall, the DDD North 2014 event and the Geek Dinner afterwards was a fantastic success, and a great time was had by all.  Andy promised that there’d be another one in 2015, which will be held back up in the North-East of England due to the alternating location of DDD North, so here’s looking forward to another wonderful DDD North conference in 2015.

DDD South West 6 In Review


image(4) This past Saturday, 25th April 2015, saw the 6th annual DDD South West event, this year held at the Redcliffe Sixth Form Centre in Bristol.  This was my very first DDD South West event, having travelled south for the two DDD East Anglia events previously, but never to the south west for this one.

I’d travelled down south on the Friday evening before the event, staying in a Premier Inn in Gloucester.  This enabled me to only have a relatively short drive on the Saturday to get to Bristol and the DDD South West event.  After a restful night’s sleep in Gloucester, I started off on the journey to Bristol, arriving at one of the recommended car parks only a few minutes walk away from the DDDSW venue.

Upon arrival at the venue, I checked myself in and proceeded up the stairs to what is effectively the Sixth Form “common room”.  This was the main hall for the DDDSW event and where all the attendees would gather, have teas, coffees & snacks throughout the day.

image(7) Well, as is customary, the first order of business is breakfast!  Thanks to the generous sponsors of the event, we had ample amounts of tea, coffee and delicious danish pastries for our breakfast!  (Surprisingly, these delicious pastries managed to last through from breakfast to the first (and possibly second) tea-break of the day!)

image(10)Well, after breakfast there was a brief introduction from the organisers as to the day’s proceedings.  All sessions would be held in rooms  on the second floor of the building and all breaks, lunch and the final gathering for the customary prize draw would be held in the communal common room.  This year’s DDDSW had 4 main tracks of sessions with a further 5th track which was the main sponsors track.  This 5th track only had two sessions throughout the day whilst the other 4 had 5 sessions in each.

The first session of the day for me was “Why Service Oriented Architecture?” by Sean Farmar

image(9) Sean starts his talk by mentioning how "small monoliths" of applications can, over time and after many tweaks to functionality, grow into large monoliths and become a maintenance nightmare – a high risk to the business, where changes are difficult to make and can have unforeseen side-effects.  When we’ve created a large monolith of an application, we’re frequently left with a “big ball of mud”.

Sean talks about one of his first websites, created back in the early 1990s.  It had around 5,000 users which, by the standards of the day, was a large number.  Both the internet and the web have grown exponentially since then, so 5,000 users is very small by today’s standards.  Sean states that we can take those numbers and “add two noughts to the end” to get a figure for a large number of users today.  Due to this scaling of the user base, our application needs to scale too, but if we start down the path of creating that big ball of mud, we’ll simply create it far quicker today than we’ve ever done in the past.

Sean continues to state that after we learn from our mistakes with the monolithic big ball of mud, we usually move to web services.  We break a large monolith into much smaller monoliths; however, these webservices then need to talk both to each other as well as to the consumers of the webservice.  For example, the sales webservice has to talk to the user webservice, which then possibly has to talk to the credit webservice in order to verify that a certain user can place an order of a specific size.  However, this creates dependencies between the various web services and each service becomes coupled in some way to one or more other services.  This coupling is a bad thing which prevents the individual web services from being able to exist and operate without the other webservices upon which they depend.

From here, we often look to move towards a Service Oriented Architecture (SOA).  SOA’s core tenets are geared around reducing this coupling between our services.

Sean mentions the issues with coupling:

Afferent (dependents) & Efferent (depends on) – These are the things that a given service depends upon and the other services that, in turn, depend upon the first service.
Temporal (time, RPC) – This is mostly seen in synchronous communications – like when a service performs a remote procedure call (RPC) to another service and has to wait for the response.  The time taken to deliver the response is temporal coupling of those services.
Spatial (deployment, endpoint address) – Sean explains this by talking about having numerous copies of (for example) a database connection string in many places.  A change to the database connection string can cause redeployments of complete parts of the system.

After looking at the problems with coupling, Sean moves on to looking at some solutions.  If we use XML (or even JSON) over the wire, along with XSD (or JSON Schema), we can define our messages and the transport of our messages using industry standards, allowing full interoperability.  To overcome the temporal coupling problems, we should use a publisher/subscriber (pub/sub) communication mechanism.  Publishers do not need to know the exact receivers of a message; it’s the subscriber’s responsibility to listen and respond to the messages that it is interested in when the publisher publishes them.  To overcome the spatial issues, we can most often use a central message queue or service bus.  This allows publishers and subscribers to communicate with each other without hard references to the location of the publisher or subscriber on the network; they both only need to communicate with the single message bus endpoint.  This frees our application code from ever knowing who (or where) we are “talking to” when sending a command or event message to some other service within the system, pushing these issues down to being an infrastructure rather than an application-level concern.  Usage of a message bus also gives us durability (persistence) of our messages, meaning that even if a service is down and unavailable when a particular event is raised, the service can still receive and respond to the event when it becomes available again at a later time.

arch Sean then shows us a diagram of a typical n-tier architecture system.  He mentions how “wide” the diagram is and how each “layer” of the application spans the full extent of that part of the system (i.e. the UI layer is a complete layer that contains all of the UI for the entire system).  All of these wide horizontal layers are dependent upon the layer above or beneath them.

Within a SOA architecture, we attempt to take this n-tier design and “slice” the diagram vertically.  Therefore each of our smaller services each contain all of the layers - a service endpoint, business logic, data access layer and database - each in thin, focused vertical slices for specific focused areas of functionality.

arch2 Sean remarks that if we're going to build this kind of system, or modify an existing n-tier system into these vertical slices of services, we must start at the database layer and separate that out.  Databases have their own transactions, which in a large monolithic DB can lock the whole DB, locking up the entire system.  This must be avoided at all costs.

Sean continues to talk about how our services should be designed.  Our services should be very granular, i.e. we shouldn't have an "UpdateUser" method that performs creation and updates of all kinds of properties of a "User" entity; we should have separate "CreateUser", "UpdateUserPassword" and "UpdateUserPhoneNumber" methods instead.  The reason is that, during maintenance, constantly extending an "UpdateUser" method will force it to take more and more arguments and parameters, and it will grow extensively in lines of code as it tries to handle more and more properties of a “user” entity, thus becoming unwieldy.  A simpler "UpdateUserPassword" is sufficiently granular that it'll probably never need to change over its lifetime and will only ever require 1 or 2 arguments/parameters to the method.

Sean then asks how many arguments our methods should take.  He says his own rule of thumb for the maximum number of arguments to any method is 2.  Once you find yourself needing 3 arguments, it's time to re-think, break up the method and create a new one.  By slicing the system vertically we do end up with many, many methods; however, each of these methods is very small, very simple and very specific, with its own individual concern.
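
In code, the kind of narrow, focused contract Sean is advocating might look something like this (an illustrative sketch of my own, not Sean's example):

using System;

public interface IUserService
{
    // Each operation does one small thing and takes at most two arguments.
    void CreateUser(string userName, string emailAddress);
    void UpdateUserPassword(Guid userId, string newPasswordHash);
    void UpdateUserPhoneNumber(Guid userId, string phoneNumber);
}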

Next we look at synchronous vs. asynchronous calls.  Remote procedure calls (RPC) will usually block and wait as one service waits for a reply from another; this won’t scale in production to millions of users.  We should instead use the pub/sub mechanism, which allows for asynchronous messaging: a service that requires data from another service doesn’t have to wait and block while the other service provides the data, it can subscribe to a message queue and be notified of the data when it’s ready and available.

Sean goes on to indicate that things like a user’s address can be used by many services, however, it’s all about the context in which that piece of data is used by that service.  For this reason it’s ok for our system to have many different representations of, effectively, the same piece of data.  For example, to an accounting service, a user’s address is merely a string that gets printed onto a letter or an invoice and it has no further meaning beyond that.  However, to a shipping service, the user’s address can and probably will affect things like delivery timescales and shipping costs.

Sean ends his talk by explaining that, whilst a piece of data can be represented in different ways by different parts of the system, only one service ever has control to write that data whereas all other services that may need that data in their own representation will only ever be read-only.

 

image (15) The next session was Richard Dalton’s “Burnout”.  This was a fascinating session and is quite an unusual talk to have at these DDD events, albeit a very important talk to have, IMHO.  Richard’s session was not about a new technology or method of improving our software development techniques as many of the other sessions at the various DDD events are, but rather this session was about the “slump” that many programmers, especially programmers of a certain age, can often feel.  We call this “burnout”.

Richard started by pointing out that developer “burnout” isn’t a sudden “crash-and-burn” explosion that suddenly happens to us, but rather it’s more akin to a candle - a slow burn that gradually, but inevitably, burns away.  Richard wanted to talk about how burnout affected him and how it can affect all of us, and importantly, what can we do to overcome the burnout if and when it happens to us.  His talk is about “keeping the fire alive” – that motivation that gets you up in the morning and puts a spring in your step to get to work, get productive and achieve things.

Richard starts by briefly running through the agenda of his talk.  He says he’ll talk about the feelings of being a bad programmer, and the “slump” that you can feel within your career, he’ll talk about both the symptoms and causes of burnout, discuss our expectations versus the reality of being a software developer along with some anti-patterns and actions.

We’re shown a slide of some quite shocking statistics regarding the attrition rate of programmers.  Computer Science graduates were surveyed to see who was still working as a programmer after a certain length of time.  After 6 years, the amount of CS graduates still working as a programmer is 57%, however after 20 years, this number is only 19%.  It’s clear that the realistic average lifespan of a programmer is perhaps only around 20-30 years.

Richard continues by stating that there’s really no such thing as a “computer programmer” anymore – there is no longer a job titled as such.  We’re all “software developers” these days, and whilst that obviously entails programming computers, it also entails far more tasks and responsibilities.  Richard talks about how his own burnout started, and how he first felt it was at least partially caused by his job and his then-current employer.  Although a good and generous employer, they were one of many companies who claimed to be agile but really only did enough to be able to use the term, without ever becoming truly agile.  He left this company to move to one that really did fully embrace the agile mantra; however, due to lots of long-standing technical debt issues, agile didn’t really seem to be working for them either.  Clearly, the first job was not the cause (or at least not the only cause) of Richard’s burnout.  He says how every day was a bad day, so much so that he could specifically remember the good days, as they were so rare and few and far between.

He felt his work had become both dull and overwhelming.  Dull in that the work was entirely unexciting, with very little sense of accomplishment once performed; overwhelming in that relatively simple tasks often took far longer to accomplish than they really should have, frequently due to “artificial complexity”.  Artificial complexity is the complexity that is not inherent within the system itself, but rather the complexity added by taking shortcuts in the software design in the name of expediency.  This accrues technical debt which, if not paid off quickly enough, leads to an unwieldy system that is difficult to maintain.  Richard also states how, from this, he felt that he simply couldn’t make a difference.  His work seemed almost irrelevant in the grand scheme of things, and this led to frustration and procrastination.  This all eventually led to feelings of self-doubt.

Richard continues talking about his own career and it was at this point he moved to Florida in the US where he worked for 1.5 years.  This was a massive change, but didn’t really address the burnout and when Richard returned he felt as though the entire industry had moved on significantly in those 1.5 years when he was away, whilst he himself had remained where he was before he went.  Richard wondered why he felt like this.  The industry had indeed changed in that time and it’s important to know that our industry does change at a very rapid pace.  Can we handle that pace of change?  Many more developers were turning to the internet and producing blogs of their own and the explosion of quality content for software developers to learn from was staggering.  Richard remarks that in a way, we all felt cleverer after reading these blogs full of useful knowledge and information, but we all feel more stupid as we feel that others know far more than we do.  What we need to remember is that we’re reading the blogs showing the “best” of each developer, not the worst.

We move on to actually discuss “What is burnout?”  Richard states that it really all starts with stress.  This stress is often caused by the expectation vs. reality gap – what we feel we should know vs. what we feel we actually do know.  Stress then leads to a cognitive decline.  The cognitive decline leads to work decline, which then causes further stress.  This becomes a vicious circle feeding upon itself, and this all starts long before we really consider that we may becoming burnt out.  It can manifest itself as a feeling of being trapped, particularly within our jobs and this leads itself onto feeling fatigued.  From here we can become irritable, start to feel self-doubt and become self-critical.  This can also lead to feeling overly negative and assuming that things just won’t work even when trying to work at them.  Richard uses a phrase that he felt within his own slump - “On good days he thought about changing jobs.  On bad days he thought about changing career”!  Richard continues by stating that often the Number 1 symptom of not having burnout is thinking that you do indeed have it.  If you think you’re suffering from burnout, you probably aren’t but when you do have it, you’ll know.

We then move on to look at what actually leads to burnout.  It often starts with a set of unclear expectations, both in our work life and in our general life as a software developer.  It can also come from having too many responsibilities, sleep and relaxation issues and a feeling of sustained pressure.  This all tends to occur under the overarching weight of expectation versus the reality of what can be achieved.

Richard states that it was this raised expectation of the industry itself (witness the emergence of agile development practices, test-driven development and a general maturing of many companies’ development processes and practices in a fairly short space of time), and the disconnect with a reality that simply didn’t live up to those expectations, that ultimately led to him feeling a great amount of stress.  For Richard, it was specifically a “bad” implementation of agile software development which actually created more pressure and artificial work stress.  The implementation of a new development practice that is supposed to improve productivity naturally raises expectations, but when it goes wrong it can widen the gap between expectation and reality, causing ever more stress.  He does further mention that this trigger for his own feelings of stress may or may not be what causes stress in others.

Richard talks about some of the things that we do as software developers that can often contribute to the feelings of burnout or of increasing stress.  He discusses how software frameworks – for example the recent explosion of JavaScript frameworks – can lead to an overwhelming amount of choice.  Too much choice then often leads to paralysis and Richard shares a link to an interesting video of a TED talk that confirms this.  We then move on to discuss software side projects.  They’re something that many developers have, but if you’re using a side-project as a means to gain fulfilment when that fulfilment is lacking within your work or professional life, it’s often a false solution.  Using side-projects as a means to try out and learn a new technology is great, but they won’t fix underlying fulfilment issues within work.  Taking a break from software development is great, however, it’s often only a short-term fix.  Like a candle, if there’s plenty of wax left you can extinguish the candle then re-light it later, however, if the candle has burned to nothing, you can’t simply re-ignite the flame.  In this case, the short break won’t really help the underlying problem.

Richard proceeds to the final section of his talk and asks, “What can we do to combat burnout?”  He suggests we must first “keep calm and lower our expectations!”  This doesn’t mean giving up; it means continuing to desire professionalism within both ourselves and the industry around us, whilst acknowledging and appreciating the gap that exists between expectation and reality.  He suggests we should do less and sleep more.  Taking more breaks away from the world of software development and simply “switching off” more often can help recharge those batteries, and we’ll come back feeling a lot better about ourselves and our work.  If you do have side-projects, make it just one.  Having many side-projects is often the result of starting many things but finishing none; starting only one thing and seeing it through to the finish is a far better proposition and provides a far greater sense of accomplishment.  Finally, we look at how we can deal with procrastination, and Richard suggests that one of the best ways to overcome it at work is to pair program.

Finally, Richard states that there’s no shame in burnout.  Lots of people suffer from it even if they don’t call it burnout, whenever you have that “slump” of productivity it can be a sign that it’s time to do something about it.  Ultimately, though, we each have to find our own way through it and do what works for us to overcome it.

 

image (19) The final talk before lunch was on the sponsor’s track, and was “Native Cross-Platform mobile apps with C# & Xamarin.Forms” by Peter Major.  Peter first states his agenda with this talk and that it’s all about Xamarin, Xamarin.Forms and what they both can and can’t do and also when you should use one over the other.

Peter starts by indicating that building mobile apps today is usually split between taking a purely “native” approach – where we code specifically for the target platform and often need multiple teams of developers for each platform we’ll be supporting – versus a “hybrid” approach which often involves using technologies like HTML5 and JavaScript to build a cross-platform application which is then deployed to each specific platform via the use of a “container” (i.e. using tools like phonegap or Apache’s Cordova).

Peter continues by looking at what Xamarin is and what it can do for us.  Xamarin allows us to build mobile applications targeting multiple platforms (iOS, Android, Windows Phone) using C# as the language.  We can leverage virtually all of the .NET or Mono framework to accomplish this.  Xamarin provides “compiled-to-native” code for our target platforms and also provides a native UI for each target platform, meaning that the user interface must be designed and implemented using the standard, native design paradigms for each target platform.

Peter then talks about what Xamarin isn’t.  It’s not a write-once, run-anywhere UI, and it’s not a replacement for learning how to design effective UIs for each of the various target platforms.  You’ll still need to know the intricacies of each platform that you’re developing for.

Peter looks at Xamarin.iOS.   He states that it’s AOT (Ahead-Of-Time) compiled to an ARM assembly.  Our C# source code is pre-compiled to IL which in turn is compiled to a native ARM assembly which contains the MONO framework embedded within it.  This allows us as developers to use virtually the full extent of the .NET / Mono framework.  Peter then looks at Xamarin.Android.  This is slightly different to Xamarin.iOS as it’s still compiled to IL code, but then the IL code is JIT (Just-In-Time) compiled inside of a MONO Virtual Machine within the Android application.  It doesn’t run natively inside the Dalvik runtime on Android.  Finally, Peter looks at Xamarin.WindowsPhone.  This is perhaps the simplest to understand as the C# code is compiled to IL and this IL can run (in a Just-In-Time manner) directly against the Windows Phone’s own runtime.

Peter then looks at whether we can use our favourite SDKs and NuGet packages in our mobile apps.  Generally, the answer is yes.  SDKs such as Amazon’s Kinesis, for example, are fully usable, but NuGet packages need to target PCLs (Portable Class Libraries) if they’re to be used.

Peter asks whether applications built with Xamarin run slower than pure native apps, and the answer is that they generally run at around the same speed.  Peter shows some statistics around this however, he does also state that the app will certainly be larger in size than a natively written app.  Peter indicates, though, that Xamarin does have a linker and so it will build your app with a cut-down version of the Mono Framework so that it’ll only include those parts of the framework that you’re actually using.

We can use pretty much all C# code and target virtually all of the .NET framework’s classes when using Xamarin with the exception of any dynamic code, so we can’t target the dynamic language runtime or use the dynamic keyword within our code.  Because of this, usage of certain standard .NET frameworks such as WCF (Windows Communication Foundation) should be done very carefully as there can often be dynamic types used behind the scenes.

Peter then moves on to talk about the next evolution with Xamarin, Xamarin.Forms.  We’re told that Xamarin.Forms is effectively an abstraction layer over the disparate UI’s for the various platforms (iOS, Android, Windows Phone).  Without Xamarin.Forms, the UI of our application needs to be designed and developed to be specific for each platform that we’re targeting, even if the application code can be shared, but with Xamarin.Forms the amount of platform specific UI code is massively reduced.  It’s important to note that the UI is not completely abstracted away, there's still some amount of specific code per platform, but it's a lot less than when using "standard" Xamarin without Xamarin.Forms.

Developing with Xamarin.Forms is very similar to developing a WPF (Windows Presentation Foundation) application.  XAML is used for the UI mark-up, and the premise is that it allows the developer to develop by feature and not by platform.  Similarly to WPF, the UI can be built up using code as well as XAML mark-up, for example:

// Inside a Xamarin.Forms page: the UI can be composed in C# rather than XAML.
Content = new StackLayout { Children = { new Button { Text = "Normal" } } };

Xamarin.Forms works by taking our mark-up, which defines the placement of Xamarin.Forms-specific “controls” and user interface elements, and converting it using a platform-specific “renderer” into native platform controls.  By default, using the standard built-in renderers means that our apps won’t necessarily “look” like the native apps you’d find on the platform.  You can customise specific UI elements (i.e. a button control) for all platforms, or you can make the customisation platform-specific.  This is achieved with a custom renderer class that inherits from the relevant built-in renderer (for example, EntryRenderer) and adds the required customisations that are specific to the platform being targeted.
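
To give a flavour of that, here’s a rough sketch (not Peter’s code, and the MyApp.Droid names are made up) of the shape of a custom Entry renderer on Android; the iOS and Windows Phone projects would each carry their own equivalent:

using Xamarin.Forms;
using Xamarin.Forms.Platform.Android;

[assembly: ExportRenderer(typeof(Entry), typeof(MyApp.Droid.CustomEntryRenderer))]

namespace MyApp.Droid
{
    // Swaps in platform-specific styling for every Xamarin.Forms Entry on Android.
    public class CustomEntryRenderer : EntryRenderer
    {
        protected override void OnElementChanged(ElementChangedEventArgs<Entry> e)
        {
            base.OnElementChanged(e);

            // Control is the underlying native Android EditText.
            if (Control != null)
            {
                Control.SetBackgroundColor(Android.Graphics.Color.LightGreen);
            }
        }
    }
}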

Peter continues to tell us that Xamarin.Forms apps are best developed using the MVVM pattern.  MVVM is Model-View-ViewModel and allows a good separation of concerns when developing applications, keeping the application code separate from the user interface code.  This mirrors the best practice for development of WPF applications.  Peter also highlights the fact that most of the built-in controls will provide two-way data binding right out of the box.  Xamarin.Forms has “attached properties” and triggers.  You can “watch” a specific property on a UI element and, in response to changes to that property, alter other properties on other UI elements.  This provides a nice and clean way to achieve effectively the same functionality as the much older (and more verbose) INotifyPropertyChanged event pattern provides.
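
A tiny illustration of that two-way binding, done in code rather than XAML (my own sketch, assuming a view model that exposes a Name property and raises PropertyChanged):

using Xamarin.Forms;

public static class BindingExample
{
    public static Entry CreateBoundEntry(object viewModel)
    {
        var entry = new Entry();

        // Changes to the Entry's text update the view model's Name property, and vice versa.
        entry.SetBinding(Entry.TextProperty, "Name", BindingMode.TwoWay);
        entry.BindingContext = viewModel;

        return entry;
    }
}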

Peter proceeds to talk about how he performs testing of his Xamarin and Xamarin.Forms apps.  He says he doesn’t do much unit testing, but performs extensive behavioural testing of the complete application instead, for which he recommends Xamarin’s own Calabash framework.

Peter continues by explaining how Xamarin.Forms mark-up contains built-in simple behaviours so, for example, you can check a textbox's input is numeric without needing to write your own code-behind methods to perform this functionality.  It can be as simple as using mark-up similar to this:

<Entry Placeholder="Sample">
    <Entry.Behaviors>
        <!-- The behaviour shown in the talk; a real project would reference it via an xmlns prefix for its namespace. -->
        <NumericTextboxBehaviour />
    </Entry.Behaviors>
</Entry>

Peter remarks about speed of Xamarin.Forms developed apps and concludes that they are definitely slower than either native apps or even normal Xamarin developed apps.  This is, unfortunately, the trade-off for the improved productivity in development.

Finally, Peter concludes his talk by summarising his views on Xamarin.Forms.  The good:  One UI Layout and very customizable although this customization does come with a fair amount of initial investment to get platform-specific customisations looking good.  The bad:  Xamarin.Forms does still contain some bugs which can be a development burden.  There’s no XAML “designer” like there is for WPF apps – it all has to be written in a basic mark-up editor. Peter also states how the built-in Xamarin.Forms renderers can contain some internal code that is difficult to override, thus limiting the level of customization in certain circumstances.  Finally, he states that Xamarin.Forms is not open source, which could be a deciding factor for adoption by some developers.

 

IMG_20150425_131838 After Peter’s talk it was time for lunch!  Lunch at DDDSW was very fitting for the location in which we were in, the South-West of England.  As a result, lunch consisted of a rather large pasty of which we could choose between Steak or Cheese & Onion varieties, along with a packet of crisps, and a piece of fruit (a choice of apples, bananas or oranges) along with more tea and coffee!  I must say, this was a very nice touch – especially having some substantial hot food and certainly made a difference from a lot of the food that is usually served for lunch at the various DDD events (which is generally a sandwich with no hot food options available).

IMG_20150425_131849 After scoffing my way through the large pasty, my crisps and the fruit – after which I was suitably satiated – I popped outside the building to make a quick phone call and enjoy some of the now pleasant and sunny weather that had overcome Bristol.

IMG_20150425_131954 After a pleasant stroll around outdoors during which I was able to work off at least a few of the calories I’d just consumed, I headed back towards the Redcliffe Sixth Form Centre for the two remaining sessions of the afternoon.

I headed back inside and headed up the stairs to the session rooms to find the next session.  This one, similar to the first of the morning was all about Service Oriented Architecture and designing distributed applications.

image (1) So the first of the afternoon’s sessions was “Introduction to NServiceBus and the Particular Platform” by Mauro Servienti.  Mauro’s talk was to be an introduction to designing and building distributed applications with a SOA (Service Oriented Architecture) and how we can use a platform like NServiceBus to easily enable that architecture.

Mauro first starts with his agenda for the talk.  He’ll explain what SOA is all about, then he’ll move on to discuss long running workflows in a distributed system and how state can be used within.  Finally, he’ll look at asynchronous monitoring of asynchronous processes for those times when something may go wrong and allow us to see where and when it did.

Mauro starts by explaining the core tenets of NServiceBus.  Within NServiceBus, all boundaries are explicit.  Services are constrained and cannot share things between them.  Services can share schema and a contract but never classes.  Services are also autonomous, and service compatibility is based solely upon policy.

NServiceBus is built around messages.  Messages are divided into two types: commands and events.  Each message is an atomic piece of information and is used to drive the system forward in some way.  Commands are imperative messages and are directed to a well-known receiver.  The receiver is expected (but not compelled) to act upon the command.  Events are messages that are an immutable representation of something that has already happened, and they are directed to anyone that is interested.  Commands and events are messages with a semantic meaning, and NServiceBus enforces those semantics – it prevents trying to broadcast a command message to many different, possibly unknown, subscribers and allows this kind of “fire-and-forget” publishing only for event messages.

We’re told about the two major messaging patterns.  The first is request and response.  Within the request/response pattern, a message is sent to a known destination – the sender knows the receiver perfectly, but the receiver doesn’t necessarily know the sender.  Here, there is coupling between the sender and the receiver.  The other major messaging pattern is publish and subscribe (commonly referred to as pub/sub).  This pattern has the constituent parts of the system become “actors”, and each “actor” in the system can act on some message that is received.  Command messages are created, and every command also raises an event message to indicate that the command was requested.  These event messages are published, and subscribers to the event can receive them without having to be known to the command generator.  Events are broadcast to anyone interested, and subscribers can subscribe, listen and act on the event, or not act on it at all.  Within a pub/sub system there is much less coupling between the system’s constituent parts, and the little coupling that exists is inverted; that is, the subscriber knows where the publisher is, not the other way round.
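
In code, the distinction looks roughly like this (a sketch using the IBus API of the NServiceBus version current at the time, with hypothetical PlaceOrder/OrderPlaced message types):

using System;
using NServiceBus;

// Marker interfaces are one way to tell NServiceBus what's a command and what's an event;
// Mauro's samples use namespace conventions instead (shown later).
public class PlaceOrder : ICommand { public Guid OrderId { get; set; } }
public class OrderPlaced : IEvent { public Guid OrderId { get; set; } }

public class OrderingExample
{
    public void PlaceAndAnnounce(IBus bus, Guid orderId)
    {
        // Command: sent to one well-known endpoint that is expected to act on it.
        bus.Send("Sales", new PlaceOrder { OrderId = orderId });

        // Event: published to whichever endpoints have subscribed, if any.
        bus.Publish(new OrderPlaced { OrderId = orderId });
    }
}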

In a pub/sub pattern, versioning is the responsibility of the publisher.  The publisher can publish multiple versions of the same event each time an event is published.  This means that we can have numerous subscribers, each of which can be listening for, and acting upon different versions of the same event message.  As a developer using NServiceBus, your job is primarily to write message handlers to handle the various messages passing around the system.  Handlers must be stateless.  This helps scalability as well as concurrency.  Handlers live inside an “Endpoint” and are hosted somewhere within the system.  Handlers are grouped by "services" which is a logical concept within the business domain (i.e. shipping, accounting etc.).  Services are hosted within Endpoints, and Endpoint instances run on a Windows machine, usually as a Windows Service.

NServiceBus messages are simply classes.  They must be serializable to be sent over the wire.  NServiceBus messages are generally stored and processed within memory, but can be made durable so that if a subscriber fails and is unavailable (for example, the machine has crashed or gone down) these messages can be retrieved from persistent storage once the machine is back online.

NServiceBus message handlers are also simply classes, which implement the IHandleMessages generic interface like so:

public class MyMessageHandler : IHandleMessages<MyMessage>
{
    // NServiceBus calls this for each MyMessage received (the synchronous signature of the version shown here).
    public void Handle(MyMessage message) { /* process the message */ }
}

So here we have a class defined to handle messages of the MyMessage type.

NServiceBus endpoints are defined within either the app.config or the web.config files within the solution:

<UnicastBusConfig>
    <MessageEndpointMappings>
        <add Assembly="MyMessages" Endpoint="MyMessagesEndpoint" />
    </MessageEndpointMappings>
</UnicastBusConfig>

Such configuration settings are only required on the Sender of the message.  There is no need to configure anything on the message receiver.

NServiceBus has a BusConfiguration class.  You use it to define which messages are defined as commands and which are defined as events.  This is easily performed with code such as the following:

var cfg = new BusConfiguration();

// Keep messages in memory (no durable storage) and identify commands/events by namespace convention.
cfg.UsePersistence<InMemoryPersistence>();
cfg.Conventions()
    .DefiningCommandsAs( t => t.Namespace != null && t.Namespace.EndsWith( ".Commands" ) )
    .DefiningEventsAs( t => t.Namespace != null && t.Namespace.EndsWith( ".Events" ) );

// Create and start the bus; dispose it (and stop processing messages) when the console app exits.
using ( var bus = Bus.Create( cfg ).Start() )
{
    Console.Read();
}

Here, we’re declaring that the Bus will use in-memory persistence (rather than any disk-based persistence of messages), and we’re saying that all of our command messages are defined within a namespace that ends with the string “.Commands” and that all of our event messages are defined within a namespace ending with the string “.Events”.

Mauro then demonstrates all of this theory with some code samples.  He has an extensive set of samples that show virtually all aspects of NServiceBus, and this solution is freely available on GitHub at the following URL:  https://github.com/mauroservienti/NServiceBus.Samples

Mauro goes on to state that when sending and receiving commands, the subscriber will usually work with concrete classes when handling messages for that specific command; however, when sending or receiving event messages, the subscriber will work with interfaces rather than concrete classes.  This is a best practice and helps greatly with versioning.

NServiceBus allows you to use your own persistence store for persisting messages.  A typical store used is RavenDB, but virtually anything can be used.  There are only two interfaces that need to be implemented by a storage provider, and many well-known databases and storage mechanisms (RavenDB, NHibernate/SQL Server etc.) have integrations with NServiceBus such that they can be used as persistent storage.  NServiceBus can also use third-party message queues: MSMQ, RabbitMQ, SQL Server, Azure Service Bus etc. can all be used.  By default, NServiceBus uses the built-in Windows MSMQ for the messaging.

Mauro goes on to talk about state.  He asks, “What if you need state during a long-running workflow of message passing?”  He explains how NServiceBus accomplishes this using “sagas”.  Sagas are durable, stateful and reliable, and they guarantee state persistence across message handling.  They can express message and state correlation, and they empower "timeouts" within the system to make decisions in an asynchronous world – i.e. they allow a command publisher to be notified after a specific period of elapsed time as to whether the command did what was expected or whether something went wrong.  Mauro demonstrates this using his NServiceBus sample code.
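
To give a flavour of what this looks like, here is a hedged sketch based on the NServiceBus saga API of that era, with made-up StartOrder/OrderTimedOut message types rather than Mauro's actual sample:

using System;
using NServiceBus;
using NServiceBus.Saga;

public class OrderSagaData : ContainSagaData
{
    public Guid OrderId { get; set; }
}

public class OrderSaga : Saga<OrderSagaData>,
                         IAmStartedByMessages<StartOrder>,
                         IHandleTimeouts<OrderTimedOut>
{
    // Correlate incoming messages with the right saga instance via the OrderId.
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<OrderSagaData> mapper)
    {
        mapper.ConfigureMapping<StartOrder>(message => message.OrderId)
              .ToSaga(sagaData => sagaData.OrderId);
    }

    public void Handle(StartOrder message)
    {
        Data.OrderId = message.OrderId;

        // Ask to be called back if the order hasn't completed within the hour.
        RequestTimeout<OrderTimedOut>(TimeSpan.FromHours(1));
    }

    public void Timeout(OrderTimedOut state)
    {
        // Decide what to do when the timeout fires; here we simply end the saga.
        MarkAsComplete();
    }
}

public class StartOrder : ICommand { public Guid OrderId { get; set; } }
public class OrderTimedOut { }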

Mauro explains how the business endpoints are responsible for storing the business state used at each stage (or step) of a saga.  The original message that kicks off a saga only stores the "orchestration" state of the saga (for example, an Order management service could start a  saga that uses an Order Creation service, a Warehouse Service and a Shipping service that creates an order, picks the items to pack and then finally ships them).

The final part of Mauro’s talk is about monitoring, and how we can monitor the various messages and the flow of all the messages passing around an inherently asynchronous system.  He states that auditing is a key feature, and that this is required when we have many asynchronous messages floating around a system in a disparate fashion.  NServiceBus provides a “behind-the-scenes” piece of software called “ServiceControl”.  ServiceControl sits in the background of all components within a system that are publishing or subscribing to NServiceBus messages and keeps its own copy of all messages sent and received within that entire system.  It therefore gives us a single place where we can get a complete overview of all of the messages from the entire system, along with their current state.

serviceinsight-sagaflow The company behind NServiceBus also provides separate software called “ServiceInsight”, which Mauro quickly demonstrates to us, showing how it provides a holistic overview and monitoring of the entire message-passing process and the instantiation and individual steps of long-running sagas.  It displays all of this data in a user interface that looks not dissimilar to an SSIS (SQL Server Integration Services) workflow diagram.

Mauro states that handling asynchronous messages can be hard.  In a system built with many disparate messages, we cannot afford to lose a single message.  To prevent message loss, Mauro says that we should never use try/catch blocks within our business code; NServiceBus will automatically “add” this kind of error handling within the creation, generation and sending of messages.  We need to consider transient failures as well as business errors.  NServiceBus will perform its own retries for transient failures of messages, but business errors must be handled by our own code.  Eventually, messages that still fail to be delivered after the configured maximum number of retries are placed into a special error message queue by NServiceBus itself, and this allows us to handle the failed messages in this queue as special cases.  To this end, Particular Software also have a separate piece of software called “ServicePulse” which allows monitoring of the entire infrastructure.  This includes all message endpoints, to see if they’re up and available to send/receive messages, as well as full monitoring of the failed message queue.

IMG_20150425_155100image (3) After Mauro’s talk it was time for another break.  Unlike the earlier breaks throughout the day, this one was a bit special.  As well as the usual teas and coffees that were available all day long, this break treated all of the attendees to some lovely cream teas!  This was a very pleasant surprise and ensured that all conference attendees were incredibly well-fed throughout the entire conference.  Kudos to the organisers, and specifically the sponsors who allowed all this to be possible.

 

After our lovely break with the coffee and cream teas, it was on to the second session of the afternoon and indeed, the final session of the whole DDD event.  The final session was entitled “Monoliths to Microservices : A Journey”, presented by Sam Elamin.

IMG_20150425_160220 Sam works for Just Eat, and his talk is all about the journey he has been on within his work, moving from large, monolithic applications to re-implementing the required functionality in a leaner, distributed system composed largely of microservices.

Sam first mentions the motivation behind his talk: failure.  He describes how we learn from our failures, and states that we need to talk about our failures more, as it’s only from failure that we can really improve.

He asks, “Why do we build monoliths?”  As developers, we know it will become painful over time but we build these monolithic systems because we’re building a system very quickly in order to ship it fast.  People then use our system and we, over time, add more and more features into it. We very rarely, if ever, get the opportunity to go back and break things down into better structured code and implement a better architecture.  Wanting to spend time performing such work is often a very hard sell to the business as we’re talking to them about a pain that they don’t feel.  It’s only the developers who feel the pain during maintenance work.

Sam then states that it’s not all a bed of roses if we break down our systems into smaller parts.  Breaking down a monolithic application into smaller components reduces the complexity of each individual component, but that complexity isn’t removed from the system; it’s moved from within the individual components to the interactions between the components.

Sam shares a definition of "What is a microservice?"  He says that Greg Young once said, "It’s anything you can rewrite in a week".  He states that a micro service should be a "business context", i.e. a single business responsibility and discrete piece of business functionality.

But how do we start to move a monolithic application to a smaller, microservices-based application?  Well, Sam tells us that he himself started with DDD (Domain Driven Design) for the design of the application and to determine the bounded contexts – the distinct areas of services or functionality within the system.  These boundaries would then communicate, as the rest of the system did, with messages in a pub/sub (publisher/subscriber) mechanism, and each conceptual part of the system was entirely encapsulated by an interface – all other parts of the system could only communicate through this interface.

Sam then talks about something that they hadn’t actually considered when they first started on the journey: race hazards.  Race hazards, or race conditions as they are also known, occur within a distributed, message-based architecture when there are failures in the system due to messages being lost or received out of order, and the system being unable to deal with this.  Testing for these kinds of failures is hard, as asynchronous messages can be difficult to test by their very nature.

Along the journey, Sam discovered that things weren’t proceeding as well as expected.  The boundaries within the system were unclear and there was no clear ownership of each bounded context within the business.  This is something that is really needed in order for each context to be accurately defined and developed.  It’s also really important to get a good ubiquitous language - which is a language and way of talking about the system that is structured around the domain model and used by all team members to connect all the activities of the team with the software - correct so that time and effort is not wasted trying to translate between code "language" and domain language.

Sam mentions how the team’s overly strict code review process actually slowed them down.  He says that code reviews are usually used to address the symptom rather than the underlying problem, which is not having good enough tests of the software, services and components.  He says this also applies to ensuring the system has the necessary amount of monitoring, auditing and metrics implemented within it to ensure speedy diagnosis of problems.

Sam talks about how, in a distributed system of many microservices, there can be a lot of data duplication.  One area of the system can deal with its own definition of a “customer”, whilst another area of the system deals with its own definition of that same “customer” data.  He says that businesses fear things like data duplication, but that it really shouldn’t matter in a distributed system and it’s often actually a good thing – this is frequently seen in systems that implement CQRS patterns, eventual consistency and correct separation of concerns and contexts due to implementation of DDD.  Sam states that, for him, decoupling is vastly preferable to avoiding duplication – if you have to duplicate some code in two different areas of the system in order to correctly provide a decoupled environment, then that’s perfectly acceptable, whereas introducing coupling simply to avoid code duplication is bad.
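
To make that a little more concrete with a sketch of my own (the Billing and Shipping context names and their properties are hypothetical, not from Sam’s system), each context can hold its own deliberately different view of the same real-world customer rather than sharing a single, coupled Customer class:

// Illustrative only – the same real-world customer, modelled separately per context.
namespace Billing
{
    public class Customer              // Billing only cares about invoicing details.
    {
        public int CustomerId { get; set; }
        public string VatNumber { get; set; }
        public string BillingAddress { get; set; }
    }
}

namespace Shipping
{
    public class Customer              // Shipping only cares about delivery details.
    {
        public int CustomerId { get; set; }
        public string DeliveryAddress { get; set; }
        public bool RequiresSignature { get; set; }
    }
}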

He further states that business monitoring (in-production monitoring of the running application and infrastructure) is also favourable to acceptance tests.  Continual monitoring of the entire production system provides the most useful set of metrics for the business, and with metrics comes freedom.  You can improve the system only when you know which bits you can easily replace, and only when you know which bits actually need to be replaced for the right reasons (i.e. replacing one component due to low performance where the business monitoring has identified that this low-performance component is a genuine system bottleneck).  Specifically, business monitoring can provide great insights not just into the system’s performance but also the business’s performance and trends, too.  For example, monitoring can surface data such as spikes in usage.  From here we can implement alerts based upon known metrics – i.e. we know we get around X number of orders between 6pm and 10pm on a Friday night; if this number drops by Y%, then send an alert.
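
As a very rough sketch of that kind of alert (my own illustration; the thresholds and the alert delegate are made up, and a real system would lean on a proper monitoring and alerting stack rather than hand-rolled code):

using System;

// Illustrative only – alert if the Friday-evening order rate drops by more than an allowed percentage.
public class OrderRateMonitor
{
    private const double AllowedDropPercentage = 20;   // the "Y%" from the example above

    public void Check(int expectedOrders, int actualOrders, Action<string> sendAlert)
    {
        var dropPercentage = 100.0 * (expectedOrders - actualOrders) / expectedOrders;
        if (dropPercentage > AllowedDropPercentage)
        {
            sendAlert(string.Format("Order volume is down {0:F1}% against the usual Friday-evening rate.", dropPercentage));
        }
    }
}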

Sam talks about "EventStorming" (a phrase coined by Alberto Brandolini) with the business/domain experts.  He says he would get them all together in a room and talk about the various “events” and the various “commands” that exist within the business domain, but whilst avoiding any vocabulary that is technical. All language used is expressed within the context of the business domain. (i.e. Order comes in, Product is shipped etc).  He states that using Event Storming really helped to move the development of system forward and really helped to define both the correct boundaries of the domain contexts, and helped to define the functionality that each separate service within the system would provide.

Finally, Sam says the downsides of moving to microservices are that it’s a very time-consuming approach and that it can be very expensive (both in terms of financial cost and time) to define the system, the bounded contexts and the individual commands and events.  Despite this, it’s a great approach and, using it within his own work, Sam has found that it’s provided the developers within his company with a reliable, scalable and maintainable system, and most importantly it’s provided the business with a system that supports their business needs both now and into the future.


After Sam’s session was over, we all reconvened in the common room and communal hall for the final section of the day.  This was the prize-draw and final wrap-up.

The organisers first thanked the very generous sponsors of the event, as without them the event simply would not have happened.  Moreover, we wouldn’t have been anywhere near as well fed as we were!

There were a number of prize draws, and the first batch was a prize from each of the in-house sponsors who had been exhibiting at the event.  Prizes here ranged from a ticket to the next NDC conference to a Raspberry Pi kit.

After the in-house sponsors had given away their individual prizes, there was a “main” prize draw, with winners drawn randomly from the session feedback provided by each conference attendee.  Amongst the prizes were iPad minis, a Nexus 9 tablet, technical books, laser pens and a myriad of software licenses.  I sat as the winners’ names were read out, watching as each person was called and the iPads and Nexus 9 were claimed by the first few people drawn.  Eventually, my own name was read out!  I was very happy and went up to the desk to claim my prize.  Unfortunately, the iPads and Nexus 9 were already gone, but I managed to get myself a license for PostSharp Ultimate!

After this, the day’s event was over.  There was a customary geek dinner that was to take place at a local Tapas restaurant later in the evening, however, I had a long drive home from Bristol back to the North-West ahead of me so I was unable to attend the geek dinner after-event.

So, my first DDD South-West was over, and I must say it was an excellent event.  Very well run and organised by the organisers and the staff of the venue and of course, made possible by the fantastic sponsors.  I’d had a really great day and I can’t wait for next year’s DDDSW event!

SSH with PuTTY, Pageant and PLink from the Windows Command Line


I’ve recently started using Git for my revision control needs, switching from Mercurial that I’ve previously used for a number of years.  I had mostly used Mercurial from a GUI, namely TortoiseHg, only occasionally dropping to the command line for ad-hoc Mercurial commands.

In switching to Git, I initially switched to an alternative GUI tool, namely SourceTree, however I very quickly decided that this time around, I wanted to try to use the command line as my main interface with the revision control tool.  This was a bold move as the Git syntax is something that had always put me off Git and made me heavily favour Mercurial, due to Mercurial’s somewhat nicer command line syntax and generally “playing better” with Windows.

So, I dived straight in and tried to get my GitHub account all set up on a new PC, accessing Git via the brilliant ConEmu terminal and using SSH for all authentication with GitHub itself.  As this is Windows, the SSH functionality was provided by PuTTY, and specifically by the PLink and Pageant utilities within the PuTTY solution.

I already had an SSH Key generated and registered with GitHub, and the private key was loaded into Pageant, which was running in the background on Windows.  The first little stumbling block was to get the command line git tool to realise it had to use the PuTTY tools in order to retrieve the SSH Key that was to be used for authentication.

This required adding an environment variable called GIT_SSH which points to the path of the PuTTY PLINK.exe program.  Adding this tells Git that it must use PLink, which acts as a kind of “gateway” between the program that needs the SSH authentication, and the other program – in this case PuTTY’s Pageant – that is providing the SSH Key.  This is a required step, and is not the default when using Git on Windows as Git is really far more aligned to the Unix/Linux way of doing things.  For SSH on Unix, this is most frequently provided by OpenSSH.

After having set up this environment variable, I could see that Git was attempting to use the PLINK.EXE program to retrieve the SSH key loaded into Pageant in order to authenticate with GitHub, however, there was a problem.  Although I definitely had the correct SSH Key registered with GitHub, and I definitely had the correct SSH Key loaded in Pageant (and Pageant was definitely running in the background!), I was continually getting the following error:

(Screenshot of the PLink error output.)

The clue to what’s wrong is there in the error text – the server that we’re trying to connect to, in this case github.com, does not have its RSA key “installed” on our local PC.  I say “installed” as the PuTTY tools will cache remote server RSA keys in the Windows registry.  If you’re using OpenSSH (either on Windows or, more likely, on Unix/Linux), the keys get cached in a completely different place.

Although the error indicates the problem, unfortunately it gives no indication of how to correct it.

The answer lies with the PLINK.exe program.  We have to issue a special one-off PLINK command to have it connect to a remote server, retrieve that server’s RSA key, then cache (or “install”) the key in the registry to allow subsequent usage of PLINK as a “gateway” (i.e. when called from the git command line tool) to be able to authenticate the server machine first, before it even attempts to authenticate with our own SSH key.

The plink command is simply:

plink.exe -v -agent git@github.com

or

plink.exe -v -agent git@bitbucket.org

(the git@github.com or git@bitbucket.org parts of the command are the user and host names required when authenticating with the GitHub or Bitbucket servers, respectively – the user is always git rather than your own account name).

The -v switch simply means verbose output and can be safely omitted.  The real magic is in the -agent switch, which instructs plink to use Pageant for the key:

(Screenshot of the verbose PLink output, showing Pageant being used for the key.)

Now we get the opportunity to actually “store” (i.e. cache or install) the key.  If we say yes, this adds the key to our Windows Registry:

(Screenshot of the prompt asking whether to store the server’s host key.)

Once we’ve completed this step, we can return to our command window and attempt our usage of git against our remote repository on either GitHub or BitBucket once more.  This time, we should have much more success:

(Screenshot of the git command now completing successfully.)

And now everything works as it should!

OWIN-Hosted Web API in an MVC Project – Mixing Token-based auth with FormsAuth


One tricky little issue that I recently came across in a new codebase was having to extend an API written using ASP.NET Web API 2.2 which was entirely contained within an ASP.NET MVC project.  The Web API was configured to use OWIN, the abstraction layer which helps to remove dependencies upon the underlying IIS host, whilst the MVC project was configured to use System.Web and communicate with IIS directly.

The intention was to use token-based HTTP Basic authentication with the Web API controllers and actions, whilst using ASP.NET Membership (Forms Authentication) with the MVC controllers and actions.  This is fairly easy to initially hook up, and all authentication within the Web API controllers was implemented via a customized AuthorizationFilterAttribute:

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method, AllowMultiple = false)]
public class TokenAuthorize : AuthorizationFilterAttribute
{
    bool _active = true;

    public TokenAuthorize() { }

    /// <summary>
    /// Overloaded constructor to allow explicit disabling of this filter's behavior.
    /// Pass false to disable (same as no filter but declarative)
    /// </summary>
    /// <param name="active"></param>
    public TokenAuthorize(bool active)
    {
        _active = active;
    }

    /// <summary>
    /// Override of the Web API filter method to handle the Basic Auth check
    /// </summary>
    /// <param name="actionContext"></param>
    public override void OnAuthorization(System.Web.Http.Controllers.HttpActionContext actionContext)
    {
        // Quit out here if the filter has been invoked with active being set to false.
        if (!_active) return;

        var authHeader = actionContext.Request.Headers.Authorization;
        if (authHeader == null || !IsTokenValid(authHeader.Parameter))
        {
            // No authorization header has been supplied, therefore we are definitely not authorized
            // so return a 401 unauthorized result.
            actionContext.Response = actionContext.ControllerContext.Request.CreateErrorResponse(HttpStatusCode.Unauthorized, Constants.APIToken.MissingOrInvalidTokenErrorMessage);
        }
    }

    private bool IsTokenValid(string parameter)
    {
        // Perform basic token checking against a value
        // stored in a database table.
        return true;
    }
}

This is hooked up onto a Web API controller quite easily with an attribute, applied at either the class or action method level:

[RoutePrefix("api/content")]
[TokenAuthorize]
public class ContentController : ApiController
{
    [Route("v1/{contentId}")]
    public IHttpActionResult GetContent_v1(int contentId)
    {
        var content = GetIContentFromContentId(contentId);
        return Ok(content);
    }
}

Now, the problem with this becomes apparent when a client hits an API endpoint without the relevant authentication header in their HTTP request.  Debugging through the code above shows the OnAuthorization method being correctly called and the Response being correctly set to a HTTP Status Code of 401 (Unauthorized), however, watching the request and response via a web debugging tool such as Fiddler shows that we’re actually getting back a 302 response, which is the HTTP Status code for a redirect.  The client will then follow this redirect with another request/response cycle, this time getting back a 200 (OK) status with a payload of our MVC Login page HTML.  What’s going on?

Well, despite correctly setting our response as a 401 Unauthorized, because we’re running the Web API Controllers from within an MVC project which has Forms Authentication enabled, our response is being captured higher up the pipeline by ASP.NET wherein Forms Authentication is applied.  What Forms Authentication does is to trap any 401 Unauthorized response and to change it into a 302 redirect to send the user/client back to the login page.  This works well for MVC Web Pages where attempts by an unauthenticated user to directly navigate to a URL that requires authentication will redirect the browser to a login page, allowing the user to login before being redirected back to the original requested resource.  Unfortunately, this doesn’t work so well for a Web API endpoint where we actually want the correct 401 Unauthorized response to be sent back to the client without any redirection.

Phil Haack wrote a blog post about this very issue, and the Update section at the top of that post shows that the ASP.NET team implemented a fix to prevent this exact issue.  It’s the SuppressFormsAuthenticationRedirect property on the HttpResponse object!

So, all is good, yes?   We simply set this property to True before returning our 401 Unauthorized response and we’re good, yes?

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method, AllowMultiple = false)]
public class TokenAuthorize : AuthorizationFilterAttribute
{
    // snip...
    public override void OnAuthorization(System.Web.Http.Controllers.HttpActionContext actionContext)
    {
        var authHeader = actionContext.Request.Headers.Authorization;
        if (authHeader == null || !IsTokenValid(authHeader.Parameter))
        {
            HttpResponse.SuppressFormsAuthenticationRedirect = true;
            actionContext.Response = actionContext.ControllerContext.Request.CreateErrorResponse(HttpStatusCode.Unauthorized, Constants.APIToken.MissingOrInvalidTokenErrorMessage);
        }
    }
}

Well, no.

You see, the SuppressFormsAuthenticationRedirect property hangs off the HttpResponse object.  The HttpResponse object is part of that System.Web assembly and it’s intimately tied into the underlying ASP.NET / IIS pipeline.  Our Web API controllers are all “hosted” on top of OWIN.  This, very specifically, divorces all of our code from the underlying server that hosts the Web API.  That actionContext.Response above isn't a HttpResponse object, it's a HttpResponseMessage object.  The HttpResponseMessage object is used by OWIN as it’s divorced from the underlying HttpContext (which is inherently tied into the underlying hosting platform – IIS) and as such, doesn’t contain, nor does it have access to a HttpResponse object, or the required SuppressFormsAuthenticationRedirect property that we desperately need!

There are a number of attempted “workarounds” that you could try in order to get access to the HttpContext object from within your OWIN-compliant Web API controller code, such as this one from Hongmei at Microsoft:

HttpContext context;
Request.Properties.TryGetValue<HttpContext>("MS_HttpContext", out context);

Apart from this not working for me, this seems quite nasty and “hacky” at best, relying upon a hard-coded string that references a request “property” that just might contain the good old HttpContext.  There’s also other very interesting and useful information contained within a Stack Overflow post that gets closer to the problem, although the suggestions to configure the IAppBuilder to use Cookie Authentication and then to perform your own logic in the OnApplyRedirect event will only work in specific situations, namely when you’re using the newer ASP.NET Identity, which itself, like OWIN, was designed to be disconnected from the underlying System.Web / IIS host.  Unfortunately, in my case, the MVC pages were still using the older ASP.NET Membership system, rather than the newer ASP.NET Identity.

So, how do we get around this?

Well, the answer lies within the setup and configuration of OWIN itself.  OWIN allows you to configure and plug-in specific “middleware” within the OWIN pipeline.  This allows all requests and responses within the OWIN pipeline to be inspected and modified by the middleware components.  It was this middleware that was being configured within the Stack Overflow suggestion of using the app.UseCookieAuthentication.  In our case, however, we simply want to inject some arbitrary code into the OWIN pipeline to be executed on every OWIN request/response cycle.

Since all of our code to setup OWIN for the Web API is running within an MVC project, we do have access to the System.Web assembly’s objects.  Therefore, the fix becomes the simple case of ensuring that our OWIN configuration contains a call to a piece of middleware that wraps a Func<T> that merely sets the required SuppressFormsAuthenticationRedirect property to true for every OWIN request/response:

// Configure WebAPI / OWIN to suppress the Forms Authentication redirect when we send a 401 Unauthorized response
// back from a web API.  As we're hosting our Web API inside an MVC project with Forms Auth enabled, without this,
// the 401 Response would be captured by the Forms Auth processing and changed into a 302 redirect with a payload
// for the Login Page.  This code implements some OWIN middleware that explicitly prevents that from happening.
app.Use((context, next) =>
{
    HttpContext.Current.Response.SuppressFormsAuthenticationRedirect = true;
    return next.Invoke();
});

And that’s it!

Because this code is executed from the Startup class that is bootstrapped when the application starts, we can reference the HttpContext object and ensure that OWIN calls execute our middleware, which is now running in the context of the wider application and thus has access to the HttpContext object of the MVC project’s hosting environment, which now allows us to set the all-important SuppressFormsAuthenticationRedirect property!

Here’s the complete Startup.cs class for reference:

[assembly: OwinStartup("Startup", typeof(Startup))]
namespace SampleProject
{
    public class Startup
    {
        public void Configuration(IAppBuilder app)
        {
            ConfigureWebAPI(app);
        }
        private void ConfigureWebAPI(IAppBuilder app)
        {
            var config = new HttpConfiguration();

            // Snip of other configuration.

            
            // Configure WebAPI / OWIN to suppress the Forms Authentication redirect when we send a 401 Unauthorized response
            // back from a web API.  As we're hosting our Web API inside an MVC project with Forms Auth enabled, without this,
            // the 401 Response would be captured by the Forms Auth processing and changed into a 302 redirect with a payload
            // for the Login Page.  This code implements some OWIN middleware that explicitly prevents that from happening.
            app.Use((context, next) =>
            {
                HttpContext.Current.Response.SuppressFormsAuthenticationRedirect = true;
                return next.Invoke();
            });

            app.UseWebApi(config);
        }
    }
}

MVC Razor Views and automated Azure deployments


The other day, I decided that I’d publish a work-in-progress website to an Azure Website.  This was a free website as part of the free package that Azure subscribers can use.  The website was a plain old vanilla ASP.NET MVC site.  Nothing fancy, just some models, some controllers, some infrastructure code and, of course, some views.

I was deploying this to Azure via a direct connection to a private BitBucket Git repository I had within my BitBucket account.  This way, a simple “git commit” and “git push” would have, in a matter of seconds, my latest changes available for me to see in a real “on-the-internet” hosted site, thus giving me a better idea of how my site would look and feel than simply running the site on localhost.

An issue I almost instantly came up against was that, whenever I’d make a tiny change to just a view file – i.e. no code changes, just tweaks to HTML markup in the .cshtml Razor view file, and commit and push that change – the automated deployment process would kick in and the site would be successfully deployed to Azure, however, the newly changed Razor View would remain unchanged.

What on earth was going on?   I could run the site locally and see the layout change just fine, however, the version deployed to Azure was the older version, prior to the change.   I first decided that I now had to manually FTP the changed .cshtml files from my local machine to the relevant place in the Azure website, which did work, but was an inelegant solution.  I needed to find out why the changed Razor views were not being reflected in the Azure deployments.

After some research, I found the answer.

Turns out that some of my view files (the .cshtml files themselves) had the “Build Action” property set to “None”.  In order for your view to be correctly deployed when using automated Azure deployments, the Build Action property must be set to “Content”.

Now, quite how the Build Action property had come to be set that way, I have no idea.  Some of the first views I’d added to the project had the Build Action set correctly, however, some of the newer views I’d added were set incorrectly.  I had not explicitly changed any of these properties on any files, so how some were originally set to Content and others set to None was a mystery.  And then I had an idea…..

I had been creating Views in a couple of different ways.  Turns out that when I was creating the View files using the Visual Studio context menu options, the files were being created with the correct default Build Action of Content.  However, when I was creating the View files using ReSharper’s QuickFix commands by hitting Alt + Enter on the line return View(); which shows the ReSharper menu allowing you to create a new Razor view with or without layout, it would create the View file with a default Build Action of None.

Bizarrely, attempting to recreate this issue in a brand new ASP.NET MVC project did not reproduce the issue (i.e. the ReSharper QuickFix command correctly created a View with a default Build Action of Content!)

I can only assume that this is some strange intermittent issue with ReSharper, although it’s quite probably caused by a specific difference between the projects that I’ve tested this on; thus far, though, I have no idea what that difference may be…  I’ll keep looking and if I find the culprit, I’ll post an update here.

Until then, I’ll remain vigilant and always remember to double-check the Build Action property and ensure it’s set to Content for all newly created Razor Views.


DDD North 2015 In Review


On Saturday 24th October 2015, DDD North held its 5th annual Developer Developer Developer event.  This time the event was held in the North-East, at the University of Sunderland.

As is customary for me now, I had arrived the evening before the event and stayed with family in nearby Newcastle-Upon-Tyne.  This allowed me to get to the University of Sunderland bright and early for registration on the morning of the event.

After checking in and receiving my badge, I proceeded to the most important area of the communal reception area, the tea and coffee urns!  After grabbing a cup of coffee and waiting patiently whilst further attendees arrived, there was soon a shout that breakfast was ready.  Once again, DDD North and the University of Sunderland provided us all with a lovely breakfast baguette, with a choice of either bacon or sausage.

After enjoying my bacon baguette and washing it down with a second cup of coffee, it was soon time for the first session of the day. The first session slot was a tricky one, as all of the five tracks of sessions appealed to me, however, I could only attend one session, so decided somewhat at the last minute it would be Rik Hepworth’s The ART of Modern Azure Deployments.

The main thrust of Rik’s session is to explain Azure Resource Templates (ART).  Rik says he’s going to explain the What, the Why and the How of ART’s.  Rik first reminds us that every resource in Azure (from virtual networks, to storage accounts, to complete virtual machines) is handled by the Azure Resource Manager.  This manager can be used and made to perform the creation of resources in an ad-hoc manner using numerous fairly arcane PowerShell commands, however, for repeatability in creating entire environments of Azure resources, we need Azure Resource Templates.

Rik first explains the What of ART’s.  They’re quite simply a JSON format document that conforms to the required ART schema.  They can be split into multiple files, one which supplies the “questions” (i.e. the template of the required resource – say a virtual network) and the other file can supply the “answers” to fill-in-the-blanks of the question file. (i.e. the parameterized IP address range of the required virtual network).  They are idempotent too, which means that the templates can be run against the Azure Resource Manager multiple times without fear of creating more resources than are required or destroying resources that already exist.

Rik proceeds with the Why of ART’s.   Well, firstly since they’re just JSON documents and text files, they can be version controlled.  This fits in very nicely with the “DevOps” culture of “configuration as code”, managed and controlled in the same way as our application source code is.  And being JSON documents, they’re much easier to write, use and maintain than large and cumbersome PowerShell scripts composed of many PowerShell commands with difficult to remember parameters.  Furthermore, Rik tells us that, eventually, Azure Resource Templates will be the only way to manage and configure complete environments of resources within Azure in the future.

Finally, we talk about the How of ART’s.  Well, they can be composed with Visual Studio 2013/2015. The only other tooling required is the Azure SDK and PowerShell.  Rik does mention some caveats here as the Azure Resource API – against which the ART’s run – is currently moving and changing at a very fast pace.  As a result of this, there are frequent updates to both the Azure SDK and the version of PowerShell needed to support the latest Azure Resource API version.  It’s important to ensure you keep this tooling up-to-date and in sync in order to have it all work correctly.

Rik goes on to talk about how the monitoring of running resource templates has improved vastly.  We can now monitor the progress of a running (or previously run) template file from portal.azure.com and resource.azure.com, which is the Resource Manager in the Azure portal.  This shows the complete JSON of the final templates, which may have consisted of a number of “question” and “answer” files that are subsequently merged together to form the final file of configuration data.  From here, we can also inspect each of the individual resources that have been created as part of running the template, for example, virtual machines etc.

Rik then mentions something called DSC.  This is Desired State Configuration.   This is now an engineering requirement for all MS products that will be cloud-based.  DSC effectively means that the “product” can be entirely configured by declarative things such as scripts, command line commands and parameters, etc.  Everything can be set and configured from here without needing to resort to any GUI.

Rik talks about how to start creating your own templates.  He says the best place to start is probably the Azure Quickstart Templates that are available from a GitHub repository.  They contain very simple templates to ease you into getting started, as well as some quite complex templates which will help should you need to create a template to deploy a complete environment of numerous resources.  Rik also mentions that next year will see the release of something called the “Azure Stack” which will make it even easier to create scripts and templates that will automate the creation and management of your entire IT infrastructure, both in the cloud and on-premise, too.

As well as supporting basic parameterization of values within an Azure Resource Template, you can also define entire sections of JSON that define a complete resource (i.e. an entire virtual machine complete with an instance of SQL Server running on it).  This JSON document can then be referenced from within other ART files, allowing individual resources to be scripted once and reused many times.  As part of this, Azure resources support many different types of extensions for extending state configuration into other products.  For example, there is an extension that allows an Azure VM to be created with an Octopus Deploy tentacle pre-installed, as well as an extension that allows a Chef client to be pre-installed on the VM, for example.

Rik shows us a sample layout of a basic Azure Resource Template project within Visual Studio.  It consists of 3 folders, Scripts, Templates and Tools.  There's a blank template in the template folder and this defines the basic "shape" of the template document.  To get started within a simple template for example, a Windows VM needs a Storage account (which can be an existing one, or can create new) and a Virtual Network before the VM can be created.

We can use the GUI tooling within Visual Studio to create the basic JSON document with the correct properties, but can then manually tweak the values required in order to script our resource creation.  This is often the best way to get started.  Once we’ve used the GUI tooling to generate the basics of the template, we can then remove duplication by "collapsing" lots of the properties and extracting them into separate files to be included within the main template script.  (i.e. deploy location is repeated for each and every VM.  If we’re deploying multiple VMs, we can remove this duplication by extracting into a separate file that is referenced by each VM).

One thing to remember when running and deploying ART’s, Rik warns us, is that the default lifetime of an Azure Access Token is only 1 hour.  Azure Access Tokens are required by the template in order to prove authorisation for creating Azure resources.  However, in the event that the ART is deploying a complete environment consisting of numerous resources, this can be a time-consuming process – often taking a few hours.  For this reason, it’s best to extend the lifetime of the Azure Access Tokens, at least during development of the templates, otherwise the tokens will expire during the running of the template, thereby making the resource creation fail.

Rik wraps up with a summary, and opens the floor to any questions.  One question that is posed is whether existing Azure resources can be “reverse-engineered” into ART scripts.  Rik states that so long as the existing resources are v2 resources (that have been created with Azure Resource Manager) then you can turn these resources into templates, BUT if existing resources are v1 (also known as Classic resources and created using the older Azure Service Management) they can't be reverse-engineered into templates.

After a short coffee break back in the main communal area, it’s time for the second session of the day.  For this session, I decided to go with Gary Short’s Deep Dive into Deep Learning.

Gary’s session was all about the field of data science and of things like neural networks and deep learning.   Gary starts by asking who knows what Neural Networks are and asks what Deep Learning is and the difference between them.  Not very many people know the difference here, but Gary assures us that by the end of his talk, we all will.

Gary tells us that his talk’s agenda is about looking at Neural Networks, being the first real mechanism by which “deep learning” was first implemented, but how today’s “deep learning” has improved on the early Neural Networks.  We first learn that the phrase “deep learning” is itself far too broad to really mean anything!

So, what is a Neural Network?  It’s a “thing” in data science.  It’s a statistical learning model that can be used to estimate functions that depend on a large number of inputs.  Well, that’s a rather dry explanation, so Gary gives us an example: the correlation between temperature over the summer months and ice cream sales over the summer months.  We could use a Neural Network to predict the ice cream sales based upon the temperature variance.  This is, of course, a very simplistic example and we can simply guess ourselves that as the temperature rises, ice cream sales would predictably rise too.  It’s a simplistic example as there’s exactly one input and exactly one output, so it’s very easy for us to reason about the outcome without really relying upon a Neural Network.  However, in a more realistic example using “big data”, we’d likely have hundreds if not many thousands of inputs for which we wish to find a meaningful output.

Gary proceeds to explain that a Neural Network is really a weighted directed graph.  This is a graph of nodes and the connections between those nodes.  The connections are in a specific direction, from one node to another, and that same node can have a connection back to the originating node.  Each connection has a “weight”, or a probability.  In the diagram, we can see that node A has a connection to node E and also a separate connection to node F.  The “weight” of the connection to node F is 0.9 whilst the weight of the connection to node E is 0.1.  This means there’s a 10% chance that a message or data coming from node A will be directed to node E and a 90% chance that a message coming from node A will be directed to node F.  The overall combination of nodes and the connections between those nodes gives us the Neural Network.
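
As a rough sketch of that structure in code (my own illustration, not anything from Gary’s demo), a node simply holds a list of weighted outgoing connections, and a message is routed onwards at random in proportion to those weights:

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only – a node in a weighted directed graph with probabilistic routing.
public class Connection
{
    public Node Target;
    public double Weight;   // e.g. 0.9 for A -> F and 0.1 for A -> E in the example above
}

public class Node
{
    private static readonly Random Rng = new Random();

    public string Name;
    public List<Connection> Connections = new List<Connection>();

    // Pick the next node at random, honouring the connection weights.
    public Node Next()
    {
        var roll = Rng.NextDouble() * Connections.Sum(c => c.Weight);
        foreach (var connection in Connections)
        {
            roll -= connection.Weight;
            if (roll <= 0) return connection.Target;
        }
        return Connections.Last().Target;
    }
}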

Gary tells us how Neural Networks are not new, they were invented in 1943 by two mathematicians, Warren McCulloch and Walter Pitts.  Back then, they weren’t referred to as Neural Networks, but were known as “Threshold Logic”.  Later on, in the late 1940's, Donald Hebb created a hypothesis based on "neural plasticity" which is the ability of a Neural Network to be able to “heal itself” around “injuries” or bad connections between nodes.  This is now known as Hebbian Learning.  In 1958, mathematicians Farley and Wesley A. Clark used a calculator to simulate a Hebbian Machine at MIT (Massachusetts Institute of Technology).

So, just how did today’s “Deep Learning” improve upon Neural Networks?  Well, Neural Networks originally had two key limitations.  Firstly, they couldn't process exclusive or (XOR) logic in a single-layer network, and secondly, computers (or rather calculators) simply weren't really powerful enough to perform the extensive processing required.  Eventually, in 1975, a mathematician named Werbos discovered something called “back propagation”, which is the backwards propagation of error states allowing originating nodes to learn of errors further down a processing chain and perform corrective measures (self-learning) to mitigate further errors.  Back propagation helped to solve the XOR problem.  It was only through the passage of a large amount of time, though, that yesterday’s calculators became today’s computers – which got ever more powerful with every passing year – and allowed Neural Networks to come into their own.  So, although people in academia were teaching the concepts of Neural Networks, they weren’t really being used, with practitioners preferring instead alternative learning mechanisms like “Support Vector Machines” (SVM) which could work with the level of computing that was available at that time.  With the advent of more powerful computers, however, Neural Networks really started to take off after the year 2000.

So, as Neural Networks started to get used, another limitation was found with them: it took a long time to “train” the model with the input data.  Gary tells us of a Neural Network in the USA, used by the USPS (United States Postal Service), that was designed to help recognise hand-written zip codes.  Whilst this model was effective at its job, it took 3 full days to train the model with input data!  This had to be repeated continually as new “styles” of hand-writing needed to be recognised by the Neural Network.

Gary continues by telling us that by the year 2006, the phrase “deep learning” had started to take off, and this arose out of the work of two mathematicians called Geoffrey Hinton and Ruslan Salakhutdinov and showed that many-layered, feed-forward Neural Networks could be trained far more effectively, thus reducing the time required to train the network.  So really, “deep learning” is really just modern day Neural Networks, but ones that have been vastly improved over the original inventions of the 1940’s. 

Gary talks about generative models and stochastic models.   Generative models will “generate” things in a random way, whilst stochastic models will generate things in an unpredictable way.  Very often this is the very same thing.  It’s this random unpredictability that exists in the problem of voice recognition.  This is now a largely “solved” problem, as of around 2010.  It’s given rise to Apple’s Siri, Google’s Google Now and most recently, and apparently most advanced, Microsoft’s Cortana.

At this point, Gary shows us a demo of some code that will categorise iris plants based upon a diverse dataset covering a number of different criteria.  The demo is implemented using the F# language, however, Gary states that the “go to” language for data science is R.  Gary says that whilst it’s powerful, it’s not a very nice language, and this is primarily put down to the fact that whilst languages like C, C#, F# etc. are designed by computer scientists, R is designed by mathematicians.  Gary’s demo can use F# as it has a “type provider” mechanism which allows it to “wrap” and therefore largely abstract away the R language.  This can be downloaded from NuGet, and you’ll also need the FsLab NuGet package.

Gary explains that the categorisation of irises is the canonical example for data science categorisation.  He shows the raw data and how the untrained system initially thinks that there are three classifications of irises when we know there's only really two.  Gary then explains that, in order to train our Neural Network to better understand our data, we need to start by "predicting the past".  This is simply what it says: for example, by looking at the past results of (say) football matches, we can use that data to help predict future results.

Gary continues and shows how, after "predicting the past" and using the resulting data to then train the Neural Network, we can once again examine the original data.  The graph this time correctly shows only two different categorisations of irises.  Looking closer at the results, we can see that of a data set containing numerous metrics for 45 different iris plants, our Neural Network was able to correctly classify 43 out of the 45 irises, with only two failures.  Looking into the specific failures, we see that they were unable to be classified due to their data being very close to the boundary between the two different classifications.  Gary says how we could probably “fine tune” our Neural Network by looking further into the data and could well eradicate the two classification failures.

After Gary’s session, it’s time for another tea and coffee break in the communal area, after which it’s time for the 3rd and final session before lunch.  There had been a couple of last-minute cancellations of sessions due to speaker ill health, and one of those was unfortunately the one I had wanted to attend in this particular time slot, Stephen Turner’s “Be Reactive, Think Reactive”.  This session was rescheduled with Robert Hogg delivering a presentation on Enterprise IoT (Internet of Things), however, the session I decided to attend was Peter Shaw’s Microservice Architecture, What It Is, Why It Matters And How To Implement It In .NET.

Peter starts his presentation with a look at the talk’s agenda.   He’s going to define what Microservices are and their benefits and drawbacks.  He’ll explain how, within the .NET world, OWIN and Katana help us with building Microservices, and finally he is going to show a demo of some code that uses OWIN running on top of IIS7 to implement a Microservice.

Firstly, Peter tells us that Microservices are not a software design pattern, they’re an architectural pattern.  They represent a 100-foot view of your entire application, not the 10-foot view, and moreover, Microservices provide a set of guidelines for deployment of your project.

Peter then talks about monolithic codebases and how we scale them by duplicating entire systems.  This can be wasteful when you only need to scale up one particular module, as you’ll end up duplicating far more than you need.  Microservices are about being able to scale only what you need, but you need to find the right balance of how much to break down the application into its constituent modules or discrete chunks of functionality.  Break it down too much and you'll get nano-services – a common anti-pattern – and will then have far too much complexity in managing too many small parts.  Break it down too little, and you’re not left with Microservices.  You’ve still got a largely monolithic application, albeit a slightly smaller one.

Next, Peter talks about how Microservices communicate to each other.  He states how there’s two schools of thought to approaching the communication problem.  One school of thought is to use an ESB (Enterprise Service Bus).  The benefits of using an ESB are that it’s a robust communications channel for all of the Microservices, however, a drawback is that it’s also a single point of failure.  The other school of thought is to use simple RESTful/HTTP communications directly between the various Microservices.  This removes the single point of failure, but does add the overhead of requiring the ability of each service to be able to “discover” other services (their availability and location for example) on the network.  This usually involves an additional tool, something like Consul, for example.
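
As a hedged sketch of the RESTful approach (entirely my own illustration; the IServiceDiscovery interface, service names and URLs are invented, and in practice the discovery lookup would be backed by a tool such as Consul), one service might call another like this:

using System.Net.Http;
using System.Threading.Tasks;

// Illustrative only – one microservice calling another over plain HTTP,
// resolving the target's base address via some form of service discovery.
public interface IServiceDiscovery
{
    string ResolveBaseAddress(string serviceName);   // hypothetical lookup
}

public class OrderHistoryClient
{
    private static readonly HttpClient Http = new HttpClient();
    private readonly IServiceDiscovery _discovery;

    public OrderHistoryClient(IServiceDiscovery discovery)
    {
        _discovery = discovery;
    }

    public async Task<string> GetOrderHistoryJsonAsync(int customerId)
    {
        var baseAddress = _discovery.ResolveBaseAddress("orders-service");
        return await Http.GetStringAsync(baseAddress + "/api/customers/" + customerId + "/orders");
    }
}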

Some of the benefits of adopting a Microservices architecture are that software development teams can be formed around each individual service.  These would be full teams with developers, project managers etc., rather than having specific technical silos within one large team.  Other benefits are that applications become far more flexible and modular and can be composed and changed easily by simply swapping out one Microservice for another.

Some of the drawbacks of Microservices are that they have a potentially higher maintenance cost as your application will often be deployed across different and more expansive platforms/servers.  Other drawbacks are the potential for “data islands” to form.  This is where your application’s data becomes disjointed and more distributed due to the nature of the architecture.  Furthermore, Microservices, if they are to be successful, will require extensive monitoring.  Monitoring of every available metric of the applications and the communications between them is essential to enable effective support of the application as a whole.

After this, Peter moves on to show us some demo code, built using OWIN and NancyFX.  OWIN is the Open Web Interface for .NET and is an open framework for decoupling .NET web applications from the underlying web server that powers the application.  Peter tells us that Microsoft’s own implementation of the OWIN standard is called Katana.  NancyFX is a lightweight web framework for .NET, and is built on top of the OWIN standard, thus decoupling the Nancy code from the underlying web server (i.e. there’s no direct references to HttpContext or other such objects).

Peter shows us how simple some of Nancy’s code can be:

public dynamic Something()
{
    var result = GetSomeData();
    if (result != null) return Response.AsJson(result);
    return 404;
}

The last line of the code is most interesting.   Since the method returns a dynamic type, returning an integer that has the same value as an HTTP status code will be inferred by the Nancy framework to actually return that status code from the method!
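
For a little extra context (this is my own minimal approximation rather than Peter’s demo; the module name, route and data access are all assumptions), a handler like that would typically be wired up inside a NancyModule along these lines:

using Nancy;

// Illustrative only – a minimal Nancy module hosting a handler similar to the snippet above.
public class ContentModule : NancyModule
{
    public ContentModule()
    {
        // Nancy routes return dynamic, so an integer becomes a status code
        // and an object can be formatted as JSON.
        Get["/content/{id}"] = parameters =>
        {
            var result = GetSomeData((int)parameters.id);
            if (result != null) return Response.AsJson(result);
            return 404;
        };
    }

    private object GetSomeData(int id)
    {
        // Stand-in for a real data lookup.
        return new { Id = id, Title = "Sample content" };
    }
}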

Peter shows us some more code, most of which is very simple and tells us that the complete demo example is available to download from GitHub.

After Peter’s talk wrapped up, it was time for lunch.  Lunch at the DDD events is usually a “brown bag” affair with a sandwich, crisps, some fruit and/or chocolate.  The catering at DDD North, and especially at the University of Sunderland, is always excellent and this year was no exception.   There was a large selection of various combinations of crisp flavours, chocolate bars and fruit along with a large selection of very nice sandwiches, including some of the more “basic” sandwich fillings for fusspots like me!  I don’t like mayonnaise, so pre-packed sandwiches are usually a tricky proposition.  This year, though, they had “plain” cheese and ham sandwiches with no additional condiments, which was excellent for me.

The excellent food was accompanied by a drink. I opted for water.  After collecting my lunch, I went off to find somewhere to sit and eat as well as somewhere that would be fairly close to a power point as I needed to charge my laptop.

I duly found a spot that allowed me to eat my lunch, charge my laptop and look out of the window onto the River Wear on what was a very nice day outside in sunny Sunderland!

After fairly quickly eating my lunch, it was time for some lunchtime Grok Talks.  These are the 15-minute, usually fairly informal talks that often take place over the lunch hour at many of these types of conferences, and especially at DDD conferences.  During the last few DDDs that I’d attended, I’d missed most of the Grok Talks for various reasons, but today, having already consumed my delicious lunch, I decided that I’d try to take them in.

By the time I’d reached the auditorium for the Grok Talks, I’d missed the first few minutes of the talk by Jeff Johnson all about Microsoft Azure and the role of Cloud Solution Architect at Microsoft.

Jeff first describes what Azure is, and explains that it’s Microsoft’s cloud platform offering numerous services and resources to individuals and companies of all sizes to be able to host their entire IT infrastructure – should they so choose – in the cloud. 

Next, Jeff shows us some impressive statistics on how Azure has grown in only a few short years.  He says that the biggest problem that Microsoft faces with Azure right now is that they simply can’t scale their infrastructure quickly enough to keep up with demand.  And it’s a demand that is continuing to grow at a very fast rate.  He says that Microsoft’s budget for expanding and growing Azure is around 5-6 billion dollars per annum, and that Azure has a very large number of users even today.

Jeff proceeds by talking about the role of Cloud Solutions Architect within Microsoft.  He explains that the role involves working very closely with customers, or more accurately potential customers to help find projects within the customers’ inventory that can be migrated to the cloud for either increased scalability, general improvement of the application, or to make the application more cost effective.  Customers are not charged for the services of a Cloud Solutions Architect, and the Cloud Solutions Architects themselves seek out and identify potential customers to see if they can be brought onboard with Azure.

Finally, Jeff talks about life at Microsoft.  He states how Microsoft in the UK has a number of “hubs”, one each in Edinburgh, Manchester and London, but that Microsoft UK employees can live anywhere.  They’ll use the “hub” only occasionally, and will often work remotely, either from home or from a customer’s site.

After Jeff’s talk, we had Peter Bull and his In The Groove talk all about developing for Microsoft’s Groove Music.  Peter explains that Groove Music is Microsoft’s equivalent to Apple’s iTunes and Google’s Google Play Music and was formerly called Xbox Music.  Peter states that Groove Music is very amenable to allowing developers to create new applications using Groove Music as it offers both an API and an SDK.  The SDK is effectively a wrapper around the raw API.  Peter then shows us a quick demo of some of the nice touches to the API which includes the retrieval of album artwork.  The API allows retrieving album artwork of varying sizes and he shows us how, when requesting a small version of some album artwork that, for example, contains a face, the Groove API will use face detection algorithms to ensure that when dynamically resizing the artwork, the face remains visible and is not cropped out of the picture.

The next Grok Talk was by John Stovin and was all about a unit testing framework called Fixie.  John starts by asking, “Why another unit testing framework?”  He explains that Fixie is quite different from other unit testing frameworks such as NUnit or xUnit.  The creator of Fixie, Patrick Lioi, stated that he created Fixie as he wanted as much flexibility in his unit testing framework as he had with the other frameworks he was using in his projects.  To this end, Fixie does not ship with any assertion framework, unlike NUnit and xUnit, allowing each Fixie user to choose his or her own assertion framework.  Fixie is also very simple in how you author tests.   There’s no [Test] style attributes and no using Fixie statements at the top of test classes.   Each test class is simply a standard public class and each test method is simply a public method whose name ends in “Test”.  Test setup and teardown is similar to xUnit in that it simply uses the class constructor and Dispose methods to perform these functions.
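
Based on John’s description, a Fixie test class would look something like the following minimal sketch (the class, the Calculator type and the hand-rolled assertion are all mine; Fixie deliberately leaves the choice of assertion library to you):

using System;

// Illustrative only – no attributes, no framework-specific base class.
public class CalculatorTests : IDisposable
{
    private readonly Calculator _calculator;

    public CalculatorTests()
    {
        // The constructor plays the role of test setup.
        _calculator = new Calculator();
    }

    public void AdditionTest()
    {
        // Fixie ships with no assertion framework, so use whichever you prefer.
        if (_calculator.Add(2, 3) != 5)
            throw new Exception("Expected 2 + 3 to equal 5");
    }

    public void Dispose()
    {
        // Dispose plays the role of test teardown.
    }
}

public class Calculator
{
    public int Add(int a, int b) { return a + b; }
}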

Interestingly, Fixie tests can inherit from a “Convention” base class which can change the behaviour of Fixie.  For example, a custom convention class can be implemented very simply to alter the behaviour of Fixie to be more like that of NUnit, with test classes decorated by a [TestFixture] attribute and test methods decorated by a [Test] attribute.  Conventions can control the discovery of tests, how tests are parameterized, how tests are executed and also how test output is displayed.

Fixie currently has lots of existing test-runners, including a command-line runner and a runner for the Visual Studio test explorer.  There’s currently a plug-in to allow ReSharper 8 to run Fixie tests, and a new plug-in/extension is currently being developed to work with ReSharper 10.  Fixie is open-source and is available on GitHub.

After John’s talk, we had the final Grok Talk of the lunch time, which was Steve Higgs’s ES6 Right Here, Right Now.  Steve’s talk is about how developers can best use and leverage ES6 (ECMAScript 6, aka JavaScript 2015) today.  Steve starts by stating that, contrary to some beliefs, ES6 is no longer the “next” version of JavaScript, but is actually the “current” version.  The standard has been completely ratified, but most browsers don’t yet fully support it.

Steve talks about some of the nice features of ES6, many of which had to be implemented with 3rd-party libraries and frameworks.  ES6 has “modules” baked right in, so there’s no longer any need to use a 3rd-party module manager.  However, if we’re targeting today's browsers and writing JavaScript targeting ES5, we can use 3rd-party libraries to emulate these new ES6 features (for example, require.js for module management).

Steve continues by stating that ES6 will now (finally) have built-in classes.  Unfortunately, they’re not “full-featured” classes like we get in many other languages (such as C#, Java etc.) as they only support constructors and public methods, and have no support for things like private methods yet.  Steve does state that private methods can be “faked” in a bit of a nasty, hacky way, but ES6 classes definitely do not have support for private variables.  Steve states that this will come in the future, in ES7.

ES6 gets “arrow functions”, which are effectively lambda functions that we know and love from C#/LINQ, for example:

var a = [
  "Hydrogen",
  "Helium",
  "Lithium",
  "Beryl­lium"
];

// Old method to return length of each element.
var a2 = a.map(function(s){ return s.length });

// New method using the new "arrow functions".
var a3 = a.map( s => s.length );

Steve continues by stating that ES6 introduces the let and const keywords.  let gives a variable block scoping rather than JavaScript’s default function scoping.  This is a welcome addition, and helps those of us who are used to working with languages such as C# etc. where our variable scoping is always block scoped.  const allows JavaScript to declare a constant.

ES6 now also has default parameters which allow us to define a default value for a function’s parameter in the event that code calling the function does not supply a value:

function doAlert(a=1) {
    alert(a);
}

// Calling doAlert without passing a value will use the
// default value supplied in the function definition.
doAlert();

Steve also mentions how ES6 now has string interpolation, also known as “template strings”, so that we can finally write code such as this:

// Old way of outputting a variable in a string.
var a = 5;
var b = 10;
console.log("Fifteen is " + (a + b) + " and\nnot " + (2 * a + b) + ".");

// New ES6 way with string interpolation, or "template strings".
var a = 5;
var b = 10;
console.log(`Fifteen is ${a + b} and\nnot ${2 * a + b}.`);

One important point to note with string interpolation is that your string must be quoted using backticks (`) rather than the normal single-quote (‘) or double-quote (“) characters!  This is something that will likely catch a lot of people out when first using this new feature.

Steve rounds off his talk by stating that there’s lots of other features in ES6, and it’s best to simply browse through them all on one of the many sites that detail them.  Steve says that we can get started with ES6 today by using something like babeljs.io, which is a JavaScript compiler (or transpiler) and allows you to transpile JavaScript code that targets ES6 into JavaScript code that is compatible with the ES5 that is fully supported by today’s browsers.

After Steve’s talk, the Grok Talks were over, and with it the lunch break was almost over too.  There was a few minutes left to head back to the communal area and grab a cup of coffee and a bottle of water to keep me going through the two afternoon sessions and the two final sessions of the day.

The first session of the afternoon was another change to the advertised session due to the previously mentioned cancellations.  This session was Pete Smith’s Beyond Responsive Design.  Pete’s session was aimed at design for modern web and mobile applications.  Pete starts with a brief history of web development.  He says that the web started solely on the desktop and was very basic at first, but very quickly grew to become better and better.  Eventually, the smartphone came along and all of these good-looking desktop websites suddenly didn’t look so good anymore.

So then, Responsive Design came along.  This attempted to address the disconnect and inconsistencies between the designs required for the desktop and the designs required for mobile.  However, Responsive Design brought with it its own problems.  Our designs became awash with extensive media queries in order to determine which screen size we were rendering for, as well as becoming dependent upon homogenous (and often large) frameworks such as Zurb’s Foundation and Bootstrap.  Pete says that this is the focus of going “beyond” responsive design.  We can solve these problems by going back to basics and simplifying what we do.

So, how do we know if we've got a problem?  Well, Pete explains that there are some sites that work great on both desktop and mobile, but overall, they’re not as widespread as we would like given where we are in our web evolution.  Pete then shows some of the issues.  Firstly, we have what Pete calls the "teeny tiny" problem.  This is  where the entire desktop site is scaled and shrunk down to display on the smaller mobile screen size.  Then there's another problem that Pete calls "Indiana’s phone and the temple of zoom" which is where a desktop site, rendered on a mobile screen, can be zoomed in continuously until it becomes completely unusable.

Pete asks “what is a page on today’s modern web?”  Well, he says there’s no such thing as a single-page application.  There’s really no difference between SPAs and non-SPA sites that use some JavaScript to perform AJAX requests to retrieve data from the server.  Pete states that there are really no good guiding design principles.  When we’re writing apps for Android or iOS, there’s a wealth of design principles that developers are expected to follow, and it’s very easy for them to do so.  A shining example of this is Google’s Material Design.  When we’re designing for the web, though, not so much.

So how do we improve?  Pete says we need to “design from the ground up”.  We need to select user-interface patterns that work well on both the desktop and on mobile.  Pete gives examples and states that UI elements like modal pop-ups and alerts work great on the desktop, but often not so well on mobile.  An example of a UI pattern that does work very well on both platforms is the “panes” (sometimes referred to as property sheets) that slide in from the side of the screen.  We see this used extensively on mobile due to the limited screen real estate, but not so much on the desktop, despite the pattern working well here also.  A great example of effective use of this design pattern is the new Microsoft Azure Preview Portal.  Pete states we should avoid using frameworks like Bootstrap or Foundation.  We should do it all ourselves and we should only revert to “responsive design” when there is a specific pattern that clearly works better on one medium than another and where no other pattern exists that works well on all mediums.

At this point in the talk, Pete moves on to show us some demo code for a website that he’s built to show off the very design patterns and features that he’s been discussing.  The code is freely available from Pete’s GitHub repository.  Pete shows his website first running on a desktop browser, then he shows the same website running on an iPad and then on a Smartphone.  Each time, due to clever use of design patterns that work well across screens of differing form factors, the website looks and feels very similar.  Obviously there are some differences, but overall, the site is very consistent.

Pete shows the code for the site and examines the CSS/LESS styles.  He says that absolute positioning for creating these kinds of sites is essential.  It allows us to ensure that certain page elements (i.e. the left-hand menu bar) are always displayed correctly and in their entirety.  He then shows how he’s used CSS3 transforms to implement the slide in/out panels or “property sheets”, simply transforming them with either +100% or -100% of their horizontal positioning to display to the left or right of the element’s original, absolute position.  Pete notes how there’s extensive use of HTML5 semantic tags, such as <nav> <content> <footer> etc.  Pete reminds us that there’s no real behaviour attached to using these tags but that they make things far easier to reason about than simply using <div> tags for everything.

Finally, Pete summarises and says that if there’s only one word to take away from his talk it’s “Simplify”.  He talks about the future and mentions that the next “big thing” to help with building sites that work well across all of the platforms that we use to consume the web is Web Components.  Web Components aid encapsulation and re-usability.  They’re available to use today, however, they’re not yet fully supported.  In fact, they are only currently supported in Chrome and Opera browsers and need a third-party JavaScript library, Polymer.js, in order to work.

The final session of the day was Richard Fennell’s Monitoring and Addressing Technical Debt With SonarQube.

Richard starts his session by defining technical debt.  He says it’s something that builds up very slowly, almost sneaks up on you.  It’s the little “cut corners” of our codebases where we’ve implemented code that seems to do the job, but is sub-optimal code.  Richard says that technical debt can grow to become so large that it can stop you in your tracks and prevent you from progressing with a project.

He then discusses the available tools that we currently have to address technical debt, but specifically within the world of Microsoft’s tooling.  Well, firstly we have compiler errors.  These are very easy to fix as we simply can’t ship our software with compiler errors, and they’ll provide immediate feedback to help us fix the problem.  Whilst compiler errors can’t be ignored, Richard says that it’s really not uncommon to come across projects that have many compiler warnings.  Compiler warnings aren’t errors as such, and as they don’t necessarily prevent us from shipping our code, we can often live with them for a long time.  Richard mentions the tools Visual Studio Code Analysis (previously known as FXCop) and StyleCop.  Code Analysis/FxCop works on your compiled code to determine potential problems or maintenance issues with the code, whilst StyleCop works on the raw source code, analysing it for style issues and conformance against a set of coding standards.  Richard says that both of these tools are great, but offer only a simple “snapshot in time” of the state of our source code.  What we really need is a much better “dashboard” to monitor the state of our code.

Richard asks, “So what would Microsoft do?”.  He continues to explain that the “old” Microsoft would go off and create their own solution to the problem, however, the “new” Microsoft, being far more amenable to adopting already-existing open source solutions, has decided to adopt the existing de-facto standard solution for analysing technical debt, a product called SonarQube by SonarSource.

Richard introduces SonarQube and states that, firstly, we must understand that it’s a Java based product.  This brings some interesting “gotchas” to the .NET developers when trying to set up a SonarQube solution as we’ll see shortly.  Richard states that SonarQube’s architecture is based upon it having a backend database to store the results of its analysis, and it also has plug-in analyzers that analyze source code.  Of course, being a Java-based product, SonarQube’s analyzers are written in Java too.  The Analyzers examine our source code and create data that is written into the SonarQube database.  From this data, a web-based front-end part of the SonarQube product can render a nice dashboard of this data in ways that help us to "visualise" our technical debt.   Richard points out that analyzers exist for many different languages and technologies, but he also offers a word of caution.  Not all analyzers are free and open source.  He states that the .NET ones currently are but (for example) the COBOL & C++ analyzers have a cost associated with them.

Richard then talks about getting an installation of SonarQube up and running.  As it’s a Java product, there’s very little in the way of nice wizards during the installation process to help us.  Lots of the configuration of the product is performed via manual editing of configuration files.  Due to this, Microsoft’s ALM Rangers group have produced a very helpful guide to help in installing the product.  The system requirements for installing SonarQube are a server running either Windows or Linux with a minimum of 1GB of RAM.  The server will need to have .NET framework 4.5.2 installed also, as this is required by the MSBuild runner which is used to run the .NET analyzer.  As it’s a Java product, obviously, Java is required to be installed on the server – either Oracle’s JRE 7 (or higher) or OpenJDK 7 (or higher).  For the required backend database SonarQube will, by default, install a database called H2, however this can be changed (and probably should be changed) to something more suited to .NET folks such as Microsoft’s SQL Server.  It’s worth noting that the free SQL Server Express will work just fine also.  Richard points out that there are some “gotchas” around the setup of the database, too.  As a Java-based product, SonarQube will be using JDBC drivers to connect to the database, and these place some restrictions on the database itself.  The database must have its collation set to Case Sensitive (CS) and Accent Sensitive (AS).  Without this, it simply won’t work!

After setup of the software, Richard explains that we’ll only get an analyzer and runner for Java source code out-of-the-box.  From here we can download and install the analyzer and runner we’ll need for analyzing C# source code.  He then shows how we need to add a special file called sonar-project.properties to the root of our project that will be analyzed.  This file contains four key values that are required in order for SonarQube to perform its analysis.  Ideally, we’d set up our installation of SonarQube on a build server, and there we’d also edit the SonarQube.Analyzers.xml file to reflect the correct database connection string to be used.

Richard now moves on to showing us a demo.  He uses the OWASP demo project, WebGoat.NET, for his demonstration.  This is an intentionally “broken” ASP.NET application which will allow SonarQube to highlight multiple technical debt issues with the code.  Richard shows SonarQube being integrated into Visual Studio Team Foundation Server 2015 as part of its build process.  Richard further explains that SonarQube analyzers are based upon processing complete folders or wildcards for file names.  He shows the default SonarQube dashboard and explains how most of the errors encountered can often be found in the various “standard” libraries that we frequently include in our projects, such as jQuery etc.  As a result of this, it’s best to really think about how we structure our solutions as it’s beneficial to keep third-party libraries in folders separate from our own code.  This way we can instruct SonarQube to ignore those folders.

Richard shows us the rules that exist in SonarQube.  There are a number of built-in rules provided by SonarQube itself, but the C# analyzer plug-in will add many of its own.  These built-in SonarQube rules are called the “Sonar Way” rules and represent the expected Sonar way of writing code.  These are very Java-centric so may only be of limited use when analyzing C# code.  The various C# rule-sets are obviously more aligned with the C# language.  Some rules are prefixed with “CA” in the rule-set list and these are the FxCop rules, whilst other rules are prefixed with “S” in the rule-set list.  These are the C# language rules and use Roslyn to perform the code analysis (hence the requirement for the .NET framework 4.5.2 to be installed).

Richard continues by showing how we can set up “quality gates” to show if one of our builds is passing or failing in quality.  This is an analysis of our code by SonarQube as part of a build process.  We can set conditions on the history of the analyses that have been run to ensure that, for example, each successive build should have no more than 98% of known bugs of the previous release.  This way, we can reason that our builds are getting progressively better in quality each time.

Finally, Richard sums up by introducing a new companion product to SonarQube called SonarLint. SonarLint is based upon the same .NET Compiler platform, Roslyn, that provides SonarQube’s analysis, however SonarLint is designed to be run inside the Visual Studio IDE and provides near real-time instant feedback on the quality of our source code as we’re editing it.  SonarLint is open source and available on Github.

After Richard’s talk was over, it was time for all of the conference attendees to gather in the main lecture hall for the final wrap-up presentation.  During the presentation, the various sponsors were thanked for all of their support.  The conference organisers did also mention how there had been a number of “no-shows” to the conference (people who’d registered to attend but had not shown up on the day and hadn’t cancelled their tickets despite repeated communication requesting people who can no longer attend to do so).  The organisers told us how every no-show not only costs the conference around £15 per person but also prevents those who were on the waiting list from being able to attend, and there was quite an extensive waiting list for the conference this year.  Here’s hoping that future DDD Conferences have fewer no-shows.

It was mentioned that DDD North is now the biggest of all of the regional DDD events, with some 450 (approx.) attendees this year – a growth on last year’s numbers – with still over 100 more people on the waiting list.  The organisers told us that they could, if it weren’t for space/venue size considerations, have run the conference for around 600-700 people.  That’s quite some numbers and shows just how popular the DDD conferences, and especially the DDD North conference, are.

One especially nice touch was that I did receive a quick mention from Andy Westgarth, the main organiser of DDD North, during the final wrap-up presentation for the use of one of my pictures for an article that had appeared in the local newspaper, the Sunderland Echo, that very day.  The picture used was one I’d taken in the same lecture hall at DDD North 2013, two years earlier.  The article is available to read online, too.

After the wrapping up came the prize draw.  As always, there were some nice prizes to be given away by both the conference organisers themselves as well as prizes to be given away by the individual sponsors, including a Nexus 9 tablet, a Surface Pro 3 and a whole host of other goodies.  As usual, I didn’t win anything, but I’d had a fantastic day at yet another superb DDD North.  Many thanks to the organisers and the various sponsors for enabling such a brilliant event to happen.

But…  It wasn’t over just yet!  There is usually a “Geek Dinner” after the DDD conferences; however, on this occasion there was to be a food and drink reception graciously hosted by Sunderland Software City.  So as we shuffled out of the Sunderland University campus, I headed to my car to drive the short distance to Sunderland Software City.

Upon arrival there was, unfortunately, no pop-up bar from Vaux Brewery as there had been two years prior.  This was a shame as I’m quite partial to a nice pint of real ale, however, the kind folks at Sunderland Software City had provided us with a number of buckets of ice-cold beers, wines and other beverages.  Of course, I was driving so I had to stick to the soft drinks anyway!

I was one of the first people to arrive at the Sunderland Software City venue as I’d driven the short distance from the University to get there, whereas most other people who were attending the reception were walking from the University.  I grabbed myself a can of Diet Coke and quickly got chatting to some fellow conference attendees sharing experiences about our day and getting to know each other and finding out what we do for a living and all about the work we do.

Not too long after getting chatting, a few of the staff from the Centre were scurrying towards the door.  What we soon realised was that the “food” element of the “food & drink” reception was arriving.  We were being treated to what must be the single largest amount of pizza I’ve ever seen in one place.  76 delicious pizzas delivered from Pizza Hut!  Check out the photo of this magnificent sight!  (Oh, and those boxes of pizza are stacked two deep, too!)

So, once the pizzas had all been delivered and laid out for us on the extensive table top, we all got stuck in.  A few slices of pizza later and an additional can of Diet Coke to wash it down and it was back to mingling with the crowd for some more networking.

Before leaving, I managed to have a natter with Andy Westgarth, the main conference organiser about the trials and tribulations of running a conference like DDD North.  Despite the fact that Andy should be living and working in the USA by the time the next DDD North conference rolls around, he did assure me that the conference was in very safe hands and should continue on next year.

After some more conversation, it was finally time for me to leave and head off back to my in-laws in Newcastle.  And with that another superb DDD North conference was over.  Here’s looking forward to next year!

Stacked 2015 In Review


On Wednesday 18th November 2015, the third Stacked event was held.  The Stacked events are community events primarily based around Windows development.  The events are free to attend and are organised by a collective group of folks from Mando Group and Microsoft UK, with sponsorship from additional companies.  The last two Stacked events were held in Liverpool in 2013 and, after a year off in 2014, Stacked returned in 2015 with an impressive line-up of speakers and talks and a new venue at the Comedy Store at Deansgate Locks in Manchester.

Being in Manchester, it was only a short train ride for me to arrive bright and early on the morning of the conference.  Registration was taking place from 8:30am to 9:10am, and I’d arrived just around 9am.  After checking in and receiving my conference lanyard, I proceeded to the bar area where complimentary tea and coffee was on offer.  After only having time for a quick cup of coffee, we were called into the main area of the venue, which was the actual stage area of the comedy club, for the first session of the day.

The first session was Mike Taulty’s Windows 10 and the Universal Windows Platform for Modern Apps.  Mike’s session was dedicated to showing us how simple it is to create applications on the Universal Windows Platform.  Mike first starts by defining the Universal Windows Platform (UWP).  The UWP is a way of writing applications using a combination of one of the .NET languages (C# or VB.NET) along with a specific “universal” version of the .NET runtime, known as .NET Core.  Mike explains that, as Windows 10 is available on so many devices and different categories of devices (PCs, laptops, tablets, phones and even tiny IoT devices such as Raspberry Pis!), the UWP sits “on top” of the different editions of Windows 10 and provides an abstraction layer allowing largely unified development on top of the UWP.  Obviously, not every “family” of devices shares the same functionality, so APIs are grouped into “contracts”, with different “contracts” being available for different classes of device.

Building a UWP application is similar to how you might build a Windows WPF application.  We use XAML for the mark-up of the user interface, and C#/VB.NET for the code behind.  Similar to WPF applications, a UWP application has an app.xaml start-up class.  Mike decides he’s going to plunge straight into a demo to show us how to create a UWP application.  His demo application will be an application that connects, via Bluetooth, to a SpheroBall (which is a great toy – it’s a small motorized ball that can be “driven” wirelessly and has lights that can light up in various RGB colours).  Mike will show this same application running on a number of different devices.

Mike first explains the make-up and structure of a UWP application.  The files we’ll find inside a UWP app – such as assets, pictures, resources etc. - are separated by "device family" (i.e. PC, tablet, phone etc.) - so we’d have different versions of each image for each device family we're targeting.  Mike explains how UWP (really XAML) applications can be "adaptive" – this is the same as a "responsive" web site.  Mike builds up his application using some pre-built snippets of code and fills in the blanks, showing us how using compiler directives we can have certain code only invoked if we’re running on a specific device.  Mike demos his app first on a laptop PC, then a Windows 10 phone and finally a Raspberry Pi.  Mike shows how we can deploy to, and control, the Raspberry Pi – which is running Windows 10 Core - by either remote PowerShell or alternatively, via a web UI built into Windows 10 Core on the device.

Mike says that when we’re building an app for IoT devices (such as the Raspberry Pi, Arduino etc.) we will often need a reference to an extension library that is specific to the IoT Core platform. This extension library, which is referenced from within our UWP project separately, allows access to additional types that wouldn't ordinarily exist within the UWP platform itself.  By checking such things as Windows.Foundation.Metadata.ApiInformation.IsApiContractPresent we can write code that only targets and is only invoked on specific classes of device.

Mike then shows us his application running on the Raspberry Pi, but being controlled via a Bluetooth-connected Xbox One controller.  After this Mike explains that Windows 10, on devices equipped with touch-sensitive screens, has built-in handwriting and “ink” recognition, so the demo proceeds to show the SpheroBall being controlled by a stylus writing on the touch-sensitive screen of Mike’s laptop.  Finally, Mike talks about Windows 10’s built-in speech recognition and shows us how, with only a few extra lines of code, we can now control the SpheroBall via voice commands!

In rounding up, Mike mentions a new part of Windows 10, which is an open connectivity technology allowing network discovery of APIs, called "AllJoyn".  It's an open, cross-platform technology and Mike says how there are even light bulbs you can currently buy that will connect to your home network via the AllJoyn technology so you can control your home lighting via network commands!

After Mike’s session, we all left the theatre area and went back to the main bar area where there was more tea and coffee available for our refreshments.  After a short 15-20 minutes break, we headed back to the theatre area to take our seats for the next session, which was Jeff Burtoft’s Windows 10 Web Platform.

Jeff starts by talking about the history of Internet Explorer with its Trident engine and Strict & Quirks mode - two rendering engines to render either quirks (i.e. old style, IE specific) or strict mode (to be more standards compliant).  Jeff says how this was OK in the past as lots of sites were written specifically for Internet Explorer, but these days, we're pretty much all standards compliant.  As a result, Microsoft decided to completely abandon the old Internet Explorer browser and gave birth to the fully standards compliant Edge browser.  Jeff then shows a slide from a study done by a website called quirksmode.com which is all about the proliferation of different versions of Chromium-based browsers.  Chromium is used by Google’s Chrome browser, but it’s also the basis for a lot of “stock” browsers that ship on smartphones.  Many of these browsers are rarely, if ever, updated.  Jeff states that some features of IE were actually implemented exactly to the HTML specification whilst other browsers’ implementations weren't exactly compliant with the W3C specification.  These browsers are now far more common, but Jeff states how Microsoft, with the Edge browser, will render things "like other browsers" even if not quite to spec.  This creates a better parity between all possible browsers so that developing web apps is more consistent across platforms.

Jeff shows a demo using the new Web Audio API with 3 different sound files being played on a web page, perfectly synchronised, each with their own volume controls.  Jeff then shows a demo of an FPS game running in the browser and controlled by an Xbox One controller.  The demo uses 3 major APIs for this – WebGL, the Web Audio API and the Xbox controller API – and manages a very impressive 40-50 frames per second, even though Jeff’s laptop isn’t the fastest and the demo is running entirely inside the browser.

Next, Jeff talks about how we can write an HTML/JavaScript app (à la Windows 8) that can be "bundled" with the EdgeHTML.dll library (the rendering engine of the Edge browser) and Chakra (the JavaScript engine of the Edge browser).  Apps developed like this can be "packaged" and deployed to run just like a desktop application, or can be "hosted" by using a "WebView" control - this allows a web app on a phone to look and act almost exactly like a native app.

Jeff then talks about a Microsoft developed, but open-source, JavaScript library called ManifoldJS.  This library is the simplest way to create hosted apps across platforms and devices.  It allows the hosted web app to be truly cross-platform across devices and browsers.  For example, packaging up your own HTML/JavaScript application using ManifoldJS would allow the same package to be deployed to the desktop, but also deployed to (for example) an Android-based smartphone where the same app would use the Cordova framework to provide native-like performance as well as allowing access to device specific features, such as GPS and other sensors etc.

Jeff demos packaging an application using ManifoldJS and creates a hosted web app, running as a "desktop" application on Windows 10, which has pulled down the HTML, CSS and JavaScript from a number of pages from the BBC Sport website including all assets (images etc.) and wrapped it up nicely into an application that runs in a desktop window and functions the same as the website itself.

Finally, Jeff also demos another hosted web app that uses his Microsoft Band and its gesture controls to automate sending specific, pre-composed tweets whilst drinking beer!  :)

After Jeff’s session, there was another break.  This time, we were treated to some nice biscuits to go with our tea and coffee!  After another 15 minutes it was time for the final session of the morning.  This one was slightly unusual as it had two presenters and was split into two halves.  The session was by Jonathan Seal & Mike Taulty and was entitled Towards A More Personal Computing Experience.

Jonathan was first to the stage and started by saying that his idea behind making computing “more personal” is largely geared around how we interact with machines and devices.  He notes how interactions are, and have been until recently, very fixed – using a keyboard and mouse to control our computers has been the norm for many years.  Now, though, he says that we’re starting to open up new ways of interaction.  These are speech and gesture controls.  Jonathan then talks about something called the “technological teller”.  This is the phenomenon whereby man takes an old way of doing something and applies it to new technology.  He shows a slide which indicates that the very first motorcars used by the US Mail service were steered using a rudder-like device, extended to the front side of the vehicle but controlling the rear wheels.  This was implemented as, at that time, we were used to “steering” something with a rudder – all we’d had to steer before the car was boats!  He explains how it was many years before the steering wheel was invented, placing the steering controls closer to where the user would actually steer the vehicle.

Jonathan shows some videos of new gesture control mechanisms in new cars that are shortly coming onto the market.  He also shows a video of a person controlling a robotic ball (similar to the SpheroBall used earlier by Mike Taulty) by using advanced facial recognition, which not only detected faces, but could detect emotional expressions in order to control the robotic ball.  For example, with a “happy” expression; the ball would roll towards the user, whilst with a “sad” or “angry” expression, the ball would roll away from the user.

After these videos, Jonathan invites Mike Taulty to the stage to show some of the facial recognition in action.   Mike first talks about something called Windows Hello, which is an alternative mechanism of authentication rather than having to enter a password.  Windows Hello works primarily on facial recognition.

Mike proceeds to show a demo of some very simple code that targets the facial recognition SDK that exists within Windows 10 and which allows us, using only a dozen or so lines of code, to get the rectangles around faces captured from the webcam.  Mike also shows how that same image can be sent to an online Microsoft Research project called Project Oxford, which further analyses the facial image and can detect all of the elements of the face (eyes, eyebrows, nose, mouth etc.) as well as provide feedback on the detected expression shown on the face (i.e. happy, sad, angry, confused etc.)  Using Project Oxford, you can, in real-time, not only detect things like emotion from facial expressions but can also detect the person’s heart rate from the detected facial data!

Mike says that the best detection requires a “depth camera”.  He has one attached to his laptop.  It’s an Intel RealSense camera which costs around £100.  Mike also shows usage of a Kinect camera to detect a full person with 25 points across all bodily limbs.  The Kinect camera can detect and track the entire skeletal frame of the body.  From this, software can use not only facial expressions, but entire body gestures to control software.

Mike also shows an application that interacts with Cortana – Microsoft’s personal assistant.  Mike shows a demo of some simple software he’s written in which spoken commands, prefixed with some specific words, are handed off by Cortana to his own software so that specific logic can be performed.  Mike asks Cortana, "Picture Search - show me pictures of cats".  The “Picture Search” prefix is a specifically coded prefix which instructs Cortana to interact with Mike’s program.  From here pictures matching “Cats” are retrieved from the internet and displayed; however, using the facial and expression detection technology, Mike can narrow his search down to show only “happy cats”!

After this session, it was lunchtime.  In previous years, lunchtime at the Stacked events was not catered and lunch was often acquired at a local sandwich shop.  However, this year, with the event being bigger and better, a lunch was provided.  And it was a lovely lunch, too!  Lunch at conferences such as these is usually a “brown bag” affair with a sandwich, crisps etc.; however, on this occasion, we were treated to a full plate of hot food!  There was a choice of 3 different curries: a vegetable curry, and a mild and a spicy chicken curry.  All served with pilau rice, a naan bread along with dips, sides of salad and a poppadum!  After queueing for the food, I took a table downstairs where there was more room and enjoyed a very delicious lunch.

As I was anticipating having to provide my own lunch, I’d brought along some cheese sandwiches and a banana, but after the lovely curry for lunch, I decided that these would make a nice snack on my train ride home at the end of the day!

After our lunch-break, it was time for the first session of the afternoon and the penultimate session of the day.  This was Mary J. Foley’s Microsoft & Developers – Now & Next.

Mary starts by saying that she’s not technical.  She’s a technology journalist, but she’s been following Microsoft for nearly 30 years.  She says that, with Windows 10, she really wanted to talk about 10 things.  But however much she tried to come up with 10 things, she could only come up with 3.  She says that firstly, there have been 3 CEOs of Microsoft.  And today, there are 3 business units - there used to be many more – Windows Division, Applications & Services Division and Cloud & Enterprise Division.  Mary says that previous CEOs of Microsoft have “bet” on numerous things, some of which have not worked out.  With the current CEO, Satya Nadella, Microsoft now has only 3 big bets.  These are: More personal computing, Productivity & Business Processes and the Intelligent Cloud.  There are also 3 platforms - Windows, Office 365, Cloud.

Mary takes the opportunity to call out some of the technologies and projects that Microsoft is currently working on.  She first mentions the “Microsoft Graph” which is a grand, unified API that allows access to all other API’s provided by Microsoft i.e. Office 365 API, Azure etc. Developers can use the Microsoft Graph to extend the functionality of Office 365 and its related applications, for example.

Mary mentions she loves codenames.  She says she found out about Project Kratos – new, as-yet-unannounced technology built on top of Office 365 and Azure, called "PowerApps".  Not much is known about Project Kratos as yet, however, it appears to be a loose set of micro-services allowing non-programmers to extend and enhance the functionality of Office 365.  It sounds like a very interesting proposition for business power users.

Mary talks about the future for cloud, and something known as PaaS 2.0 (Platform as a Service) which is also called Azure Service Fabric.  This is essentially lots of pre-built micro-services that can be consumed by developers.  Mary then quickly discusses one of her favourite project codenames from the past, “Red Dog”.  She says it was the codename for what eventually became Azure.  She says the codename originally came from some of the team members who were aware of a local strip club called the “Pink Poodle”, and so “Red Dog” was born!

Next, Mary goes on to talk about Bing.  She says that Bing is not just a search engine but is actually a whole developer platform as there are quite a lot of Bing related API’s. Bing has been around for quite some time, however, as a developer platform, it never really took off.  Mary says that Microsoft is now giving the Bing platform another “push”.  She mentions Project Satori, which is an “entity engine” and allows Bing and newer technology such as Cortana to better understand the web and search (i.e. a distributed knowledge graph).

Mary then proceeds to mention that Microsoft has a team known as the "deep tech team" within the Developer Division.  She says how their job is to go out to companies that may have difficult technology problems that require solutions and to help those companies solve the problems.  Interestingly, the team are free to solve those problems using non-Microsoft technology as well as Microsoft technologies – whatever is the best solution to the problem.  The team will even help companies who are already committed to non-Microsoft technologies (i.e. pure Linux “shops” or pure Apple shops).  She says they have a series of videos on YouTube and Channel 9 known as the “Decoded” series, and that these videos are well worth checking out.

Mary then talks about another project, codenamed “Red Stone”.  This is the codename for what is effectively Windows 11, but which will be released as a significant update to Windows 10 (similar to Threshold2, however Red Stone is predicted to be 2 or 3 updates on from Threshold2).  She also talks about a few rumours within Microsoft.  One is that Microsoft may produce a Surface Phone, whilst the other is that Microsoft, if Windows Phone doesn’t gain significantly more market share, may switch their mobile phone operating systems to Android!

Finally, Mary talks about another imminent new technology from Microsoft called “GigJam”.  It’s billed as “a blank canvas you can fill with information and actions from your business systems.”  Mary says it’s one of those technologies that’s very difficult to explain, but once you’ve seen and used it, it’s very impressive.  Another one to watch!

After Mary’s session, there was a final coffee break after which was the last session of the day.  This session was Martin Beeby’s My Little Edge Case And IoT.  Martin had created something called "Edge Case", which was built to help him solve one of his own business problems that he has as a developer evangelist.  He needed a unique and interesting "call to action" from the events that he attends.  Edge Case is a sort of arcade-cabinet-sized device that allows users to enter a URL that would be sent to Microsoft’s SiteScan website in order to test the rendering of that URL.  The device is a steampunk style machine complete with old fashioned typewriter keyboard for input, old pixelated LCD displays and valve based lightbulbs for output and even a smoke machine!

Martin outsourced the building of the machine to a specialist company.  He mentions the company name and their Italian domain, wemakeawesomesh.it, which raises a few laughs in the audience.  Martin talks about how, after the full machine was built, he wanted to create a "micro" edge case, essentially a miniaturized version of the real thing, running on a single Raspberry Pi and made such that it could fit inside an old orange juice carton!  Martin mentions that he’s placed the code for his small IoT (Internet of Things) device on GitHub.

Martin demos the final micro edge case on stage.  Firstly, he asks the audience to send an SMS message using their phones to a specific phone number which he puts up on the big screen.  He asks that the SMS message simply contain a URL in the text.  Next, Martin uses his mini device to connect to the internet and access an API provided by Twilio in order to retrieve the SMS messages, one at a time, previously sent by the audience members.  The little device takes each URL and displays it to a small LCD screen built into the front of the micro edge case.  Martin reads out those URL’s and after a slight delay whilst the device sends the URL to the SiteScan service, Martin finally tells us how those URL’s have been rated by SiteScan, again, displayed on the small LCD screen of the micro edge case.

After Martin’s session was over, we were at the end of the day.  There was a further session later in the evening whereby Mary J. Foley was recording her Windows Weekly podcast live from the event; however, I had to leave to catch my train back home.  Stacked 2015 was another great event in the IT conference calendar, and here’s hoping the event will return again in 2016!

SQLBits 2016 In Review


On 7th May 2016 in Liverpool, the 15th annual SQLBits event took place in the new Liverpool Exhibition Centre.  The event had actually been running since Wednesday 4th, however, as with all other SQLBits events, the Saturday is a free, community day.

This particular SQLBits was rather special, as Microsoft had selected the event as the UK launch event for SQL Server 2016.  As such the entire conference had a very large Microsoft presence.

Since the event was in my home town, I didn’t have too far to travel to get to the venue.  That said, I did have to set my alarm for 6am (a full 45 minutes earlier than I usually do on a working weekday!) to ensure I could get the two different trains required to get me to the venue in good time.  The Saturday day is jam packed with content and as such, the event opened at the eye-watering time of 7:30am!

After arriving at the venue just as it was opening at 7:30am, I headed straight to the registration booth to confirm my registration and collect my conference lanyard.  Once collected, it was time to head into the main hall.  The theme for this year’s SQLBits was “SQLBits in Space”, so the various rooms where the sessions would take place were giant inflatable white domes dotted around the main hall.  In between the domes and around the main hall there was plenty of space and sci-fi themed objects.

After a short while, the venue staff started to wheel out the morning refreshments of tea & coffee, shortly followed by the obligatory bacon, sausage and egg sandwiches!

After enjoying the delicious breakfast, it was soon time to head off to the relevant “dome” for the first session of the day.  The SQLBits Saturday event had 9 different tracks, so choosing which talk to attend was difficult and there were always bound to be clashes of interesting content throughout the day.  For the first session, I decided to attend Aaron Bertrand’s T-SQL: Bad Habits and Best Practices.

Aaron’s talk is all about the various bad habits that we can sometimes pick up when writing T-SQL code and also the myths that have built up around certain approaches to achieving specific things with T-SQL.  Aaron starts by stating that we should ensure that we don’t make blind assumptions about anything in SQL Server.  We can’t always say that a seek is better than a scan (or vice-versa) or that a clustered index is better than a non-clustered one.  It always depends.  The first big myth we encounter is the often-stated claim that using SELECT * to retrieve all columns from a table is bad practice (instead of naming all columns individually).  This can be bad practice as we don’t know exactly what columns we’ll be getting – e.g. columns added in the future will be returned by the query – however, it’s often stated that another reason it’s bad practice is that SQL Server has to look up the database metadata to figure out the column names.  The reality is that SQL Server will do this anyway, even with named columns!

Next, Aaron shows us a little tip using SQL Server Management Studio.  It’s something that many audience members already knew, but it was new to me.  He showed how you can drag-and-drop the “Columns” node from the left-hand treeview into a query window and it will add a comma-separated list of all of the table’s columns to the query text!

Aaron continues by warning us about omitting explicit lengths from varchar/nvarchar data types.  Without specifying explicit lengths, varchars can very easily be truncated to a single character as this simple T-SQL shows:

DECLARE @x VARCHAR = 'testing'; 
SELECT [myCol] = @x;

We’re told that we should always use the correct data types for our data!  This may seem obvious, but many times we see people storing dates as varchars (strings) simply to ensure they can preserve the exact formatting that they’re using.  This is a presentation concern, though, and doing this means we lose the ability to perform correct sorting and date arithmetic on the values.  Also, avoid using a datatype such as MONEY simply because it sounds appropriate.  MONEY is a particularly bad example and should always be replaced with DECIMAL.
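
As a rough sketch of the point (the table and column names here are invented purely for illustration):

-- Store dates as dates, not strings, and prefer DECIMAL over MONEY.
CREATE TABLE dbo.Orders
(
    OrderId     INT IDENTITY(1,1) PRIMARY KEY,
    OrderDate   DATE          NOT NULL,  -- not VARCHAR(10)
    OrderTotal  DECIMAL(19,4) NOT NULL   -- not MONEY
);

-- Formatting (e.g. dd/mm/yyyy) is then purely a presentation concern.
SELECT OrderId, FORMAT(OrderDate, 'dd/MM/yyyy') AS DisplayDate
FROM dbo.Orders;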

Aaron reminds us to always explicitly use a schema prefix when referencing tables and SQL Server objects within our queries (i.e. use [dbo].[TableName] rather than just [TableName]).  Doing this ensures that, if two different users of our query have different default schemas, there won’t be any strange potential side-effects to our query.

We’re reminded not to abuse the ORDER BY clause.  Using ORDER BY with an ordinal column number after it can easily break if columns are added, removed or their order in the schema altered.  Be aware of the myth that tables have a “natural order” – they don’t.  Omitting an ORDER BY clause may appear to order the data the same way each time, however, this can easily change if additional indexes are added to the table.
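
A quick sketch of the difference (hypothetical table and columns):

-- Fragile: ordering by ordinal position breaks if the column list changes.
SELECT FirstName, LastName FROM dbo.Customers ORDER BY 2;

-- Robust: always name the column explicitly.
SELECT FirstName, LastName FROM dbo.Customers ORDER BY LastName;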

We should always use the SET NOCOUNT ON directive as this cuts down on noisy chatter in our application’s communication with SQL Server, but make sure you always test this first.  Applications built using older technologies, such as the original ADO from the Classic ASP era, can be reliant upon the additional count message being returned when NOCOUNT is off.
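
In a stored procedure, that typically looks something like the following (a hypothetical procedure, purely for illustration):

CREATE PROCEDURE dbo.GetCustomerOrders @CustomerId INT
AS
BEGIN
    -- Suppress the "(n row(s) affected)" messages for each statement.
    SET NOCOUNT ON;

    SELECT OrderId, OrderDate
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;
END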

Next, Aaron highlights the cost of poorly written date / range queries.  He tells us that we shouldn’t use non-sargable expressions on a column – for example, if we use a WHERE clause which does something like WHERE YEAR([DateColumn]) = 2016, SQL Server will not be able to utilise any indexes that may exist on that column and will have to scan the entire table to compute the YEAR() function for the date column in question – a very expensive operation.  We’re told not to use the BETWEEN keyword as it’s imprecise – does BETWEEN include the boundary conditions or only everything between them?  It’s far better to explicitly use a greater than and less than clause for date ranges – e.g. WHERE [OrderDate] > ‘1 Feb 2016’ AND [OrderDate] < ‘1 March 2016’.  This ensures we’re not incorrectly including outlying boundary values (i.e. midnight at the end of 28th Feb, which is actually 1st March!).  Regarding dates, we should also be aware of date format strings.  Formatting a date with many date format strings can give entirely different values for different languages.  The only two “safe” format strings which work the same across all languages are YYYYMMDD and the full ISO 8601 date format string, “YYYY-MM-DDTHH:MM:SS”.
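
As a rough sketch of the difference (the table and columns are hypothetical, and the range uses the inclusive-lower/exclusive-upper style together with the safe YYYYMMDD literal format mentioned above):

-- Non-sargable: the function on the column forces a scan of every row.
SELECT OrderId FROM dbo.Orders WHERE YEAR(OrderDate) = 2016;

-- Sargable: an open-ended range can use an index on OrderDate.
SELECT OrderId
FROM dbo.Orders
WHERE OrderDate >= '20160101'
  AND OrderDate <  '20170101';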

Aaron continues by reminding us to use the MERGE statement wisely.  We must remember that it effectively turns two statements into one, but this can potentially mess with triggers, especially if they rely on @@ROWCOUNT.  Next up is cursors.  We shouldn’t default to using a cursor if we can help it.  Sometimes, it can be difficult to think in set-based terms to avoid the cursor, but it’s worth the investment of time to see if some computation can be performed in a set-based way.  If you must use a cursor, it’s almost always best to apply the LOCAL FAST_FORWARD qualifier on our cursor definition as the vast majority of cursors we’ll use are “firehose” cursors (i.e. we iterate over each row of data once from start to end in a forward-only manner).  Remember that applying no options to the cursor definition effectively means the cursor is defined with the default options, which are rather “heavy-handed” and not always the most performant.
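
Where a cursor genuinely is needed, the qualifier goes on the DECLARE – something along these lines (a hypothetical, purely illustrative cursor):

DECLARE @CustomerId INT;

DECLARE customer_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT CustomerId FROM dbo.Customers;

OPEN customer_cursor;
FETCH NEXT FROM customer_cursor INTO @CustomerId;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- ... per-row work here ...
    FETCH NEXT FROM customer_cursor INTO @CustomerId;
END

CLOSE customer_cursor;
DEALLOCATE customer_cursor;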

We’re reminded that we should always use sp_executesql when executing dynamic SQL rather than using the EXEC() statement.  sp_executesql allows the use of strongly-typed parameters (although unfortunately not for dynamic table or column names) which reduces the chances of SQL injection.  It’s not complete protection against injection attacks, but it’s better than nothing.  We’re also reminded not to use CASE or COALESCE in sub-queries.  COALESCE turns into a CASE statement within the query plan which means SQL Server will effectively evaluate the inner query twice.  Aaron asks that we remember to use semi-colons to separate our SQL statements.  It protects against future edits to the query/code and ensures atomic statements continue to operate in that way.
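
A minimal sketch of the parameterised form (table, column and parameter names are hypothetical):

-- Parameterised dynamic SQL via sp_executesql...
DECLARE @CustomerId INT = 42;

EXEC sp_executesql
    N'SELECT OrderId, OrderDate FROM dbo.Orders WHERE CustomerId = @CustId;',
    N'@CustId INT',
    @CustId = @CustomerId;

-- ...rather than concatenating values into a string for EXEC():
-- EXEC('SELECT OrderId, OrderDate FROM dbo.Orders WHERE CustomerId = '
--      + CAST(@CustomerId AS VARCHAR(10)));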

Aaron says that we should not abuse the COUNT() function.  We very often write code such as:

IF (SELECT COUNT(*) FROM [SomeTable]) > 0 ...

when it’s really much more efficient to write:

IF EXISTS (SELECT 1 FROM [SomeTable]) ...

We don’t really need the count in the first query so there’s no reason to use it.  Moreover, if you do really need a table count, it’s much better to query the sys.partitions table to get the count:

-- Do this:
SELECT SUM(rows) FROM sys.partitions WHERE index_id IN (0,1)
AND object_id = (SELECT object_id FROM sys.tables WHERE name = 'Addresses')
-- Instead of this:
SELECT COUNT(*) FROM Addresses

Aaron’s final two points are to ensure we don’t overuse the NOLOCK hint.  It’s a magic “go-faster stripes” turbo button for your query but it can produce inaccurate results.  This is fine if, for example, you only need a “ballpark” row count, however, it’s almost always better to use a scope-levelled READ COMMITTED SNAPSHOT isolation level for your query instead.  This must be tested, though, as this can place a heavy load on the tempdb.  Finally, we should remember to always wrap every query we do with a BEGIN TRANSACTION and a COMMIT or ROLLBACK.  Remember – SQL Server doesn’t have an “undo” button!  And it’s perfectly fine to simply BEGIN a transaction when writing ad-hoc queries in SQL Server Management Studio, even if we don’t explicitly close it straight away.  The transaction will remain so long as the connection remains open, so we can always manually perform the commit or the rollback at a slightly later point in time.
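
For the ad-hoc query case, that safety net looks something like the following (a hypothetical update, purely for illustration):

BEGIN TRANSACTION;

UPDATE dbo.Orders
SET OrderStatus = 'Cancelled'
WHERE OrderId = 12345;

-- Check the affected row count / results first, then either:
-- COMMIT TRANSACTION;
-- or, if it wasn't what we intended:
-- ROLLBACK TRANSACTION;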

And with that, Aaron’s session was over.  An excellent and informative start to the day.

After a coffee break, during which time there were some leftover breakfast bacon sandwiches available for those people who fancied a second breakfast, it was time to head off to the next session.  For this one, I’d chosen something a little leftfield.  This wasn’t a session directly based upon technology, but rather was a session based upon employment within the field of technology.  This was Alex Whittle’s Permy, Contractor Or Freelance.

Alex’s session was focused on how we might get employed within the technology sector, the various options open to us in gaining meaningful employment and the pros and cons associated with each approach.

Alex starts his talk by introducing himself and talking us through his own career history so far.  He started as an employee developer, then a team lead and then director of software before branching out on his own to become a contractor.  From there, he became a freelancer and finally started his own consultancy company.

Alex talks about an employer’s expectations for the various types of working relationship.  For permanent employees, the focus is very much on your overall personality, attitude and ability to learn.  Employers are making a long-term bet with a permanent employee.  For contractors, it’s your existing experience in a given technology or specific industry that will appeal most to the client.  They’re looking for someone who can deliver without needing any training “on-the-job” although you’ll get time to “figure it out” whilst you’re there.  You’ll also have “tech-level” conversations with your client, so largely avoiding the politics that can come with a permanent role.  Finally, as a freelancer, you’ll be engaged because of your technical expertise and your ability to deliver quickly.  You’re expected to be a business expert too and your engagement will revolve around “senior management/CxO” level conversations with the client.

Alex moves on to discuss the various ways of marketing yourself based upon the working relationship.  For permanent employees it’s recruitment agencies, LinkedIn and keeping your CV up to date.  Your main marketing point is your stability, so your CV needs to show a list of jobs with good lengths of tenure for each one.  One or two shorter tenures are acceptable, but you’ll need to be able to potentially explain them well to a prospective employer.  For contractors, it’s much the same avenues for marketing – recruitment agencies, LinkedIn and a good CV – but here the focus is quite different.  Firstly, a contractor’s CV can be much longer than a permanent employee’s CV, which is usually limited to 3 pages.  A contractor’s CV can be up to 4-6 pages long and should highlight relevant technical and industry experience as well as show contract extensions and renewals (although older roles should be in summary only).  For freelancers, it’s not really about your CV at all.  Clients are not interested in you per se; they’re interested in your company.  This is where company reputation and your ability to really sell the company itself have the biggest impact.  For all working relationships, one of the biggest factors is networking.  Networking will lead to contacts, which will lead to roles.  Never underestimate the power of simply speaking to people!

We then move on to talk about cash flow in the various types of working relationship.  Alex states how for permanent employees, there’s long-term stability, holiday and sickness pay and also a pension.  It’s the “safest” and lowest stress option.  For contractors, cash flow has medium-term stability.  There’s no holiday or sickness pay and you’d need to pay for your own pension.  You need to build a good cash buffer of at least 6 months’ living expenses, but you can probably get started on the contracting road with only 1 or 2 months of cash buffer.  Finally, the freelance option is the least “safe” and has the least stability of cash flow.  It’s often very “spiky” and can consist of short periods of good income interspersed with longer periods of little or no income.  For this reason, it’s essential to build a cash buffer of at least 12 months’ living expenses, although the quieter income periods can be mitigated by taking on short term contracts.

Alex shares details on the time when he had quit his permanent job to go contracting.  He says he sent around 20-30 CVs to various contract jobs per week for the first 3 weeks but didn’t get a single interview.  A helpful recruiter eventually told him that it was probably largely to do with the layout of his CV.  This recruiter spent 30 minutes with him on the phone, helping him to reformat his CV, after which he sent out another 10 CVs to various contract roles and got almost 10 interviews as a result!

We continue by looking into differences in accounting structures between the various working types.  As a permanent employee, there’s nothing to worry about at all here, it’s all sorted for you as a PAYE employee.  As a contractor, you’ll send out invoices usually once a month, but since you’ll rarely have more than one client at a time, the invoicing requirements are fairly simple.  You will need to do real-time PAYE returns as you’ll be both a director and employee of your Ltd. company and you’ll need to perform year-end tax returns and quarterly VAT returns, however, you can use the flat-rate VAT scheme if it’s applicable to you.  This can boost your income as you charge your clients VAT at 20% but only have to pay 14.5% to HMRC!  As a freelancer, you’ll also be sending out invoices, however, you may have more than one client at a time so you may have multiple invoices per month thereby requiring better management of them (such software as Xero or Quickbooks can help here).  One useful tip that Alex shares at this point is that, as a freelancer, it can be very beneficial to join the Federation of Small Businesses (FSB) as they can help to pay for things like tax investigations, should you ever receive one.

Alex then talks about how you can, as an independent contractor, either operate as a sole trader, work for an umbrella company, or run your own Limited company.  A Limited company is usually the best route to go down as Limited companies are entirely separate legal entities so you’re more protected personally (although not from things like malpractice), however, the previous tax efficiency of company dividends that used to be enjoyed by Ltds no longer applies due to the loophole in the law being closed.  As a sole trader, you are the company – the same legal entity, so you can’t be VAT registered and you’re not personally protected from liability.  When working for an umbrella company, you become a permanent employee of the umbrella company.  They invoice on your behalf and pay your PAYE.  This affords you the same protection as any other employee and takes away some of the management of invoicing etc.; however, this is probably the least cost efficient way of working since the umbrella company will take a cut of your earnings.

We then move on to the thorny issue of IR35.  This is legislation that is designed to catch contractors who are really operating more as “disguised employees”.  IR35 is constantly evolving and its application by HMRC can be inconsistent.  The best ways to mitigate being “caught” inside of IR35 legislation are to perform tasks that an employee does not do.  For example: advertising your business differentiates you from an employee; ensuring your contracts have a “right of substitution” (whereby the actual person performing the work can be changed); having multiple contracts at any one time – whilst sometimes difficult for a contractor to achieve – can greatly help; showing that you are taking on risk (especially financial risk); and being able to show that you don’t receive any benefits from the engagement as an employee would do (for example, no sick pay).

Finally, Alex asks, “When should you change?”  He puts a number of questions forward that we’d each need to answer for ourselves.  Are you happy with your current way of working?  Understand the relative balance of income versus stress from the various working practices.  Define your goals regarding work/life balance.  Ask yourself why you would want to change, how do you stand to benefit?  Where do you live?  Be aware that very often, contracts may not be readily available in your area, and that you may need to travel considerable distance (or even stay away from home during the working week), and finally, Alex asks that you ask yourself, “Are you good enough?”.  Alex closes by re-stating the key takeaways.  Enjoy your job, figure out your goals, increase your profile, network, remember that change can be good – but only for the right reasons, and start now – don’t wait.


After another coffee break following Alex’s session, it’s time for the next one.  This one was Lori Edwards’ “SQL Server Statistics – What are the chances?”

Lori opens by asking “What are statistics?”.  Just as Indexes provide a “path” to find some data, usually based upon a single column, statistics contain information relating to the distribution of data within a column across the entire set of rows within the table.  Statistics are always created when you create an index, but you can create statistics without needing an index.

Statistics can help with predicates in your SQL Server queries.  Predicates are the conditions within your WHERE or ORDER BY clauses.  Statistics contain information about density, which refers to the number of unique values in the column along with cardinality which refers to the uniqueness of a given value.  There’s a number of different ways to create statistics, you can simply add an index, you can use AUTO CREATE STATISTICS and CREATE STATISTICS directives as well as using a system stored procedure, sp_createstats.  If you’re querying on a column, statistics for that column will be automatically created for you if they don’t already exist, however, if you anticipate heavy querying utilising a given column, it’s best to ensure that statistics are created ahead of time.
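
As a minimal sketch of those creation options (the database, table and statistics names here are my own hypothetical examples, not Lori’s):

-- Let the database create missing single-column statistics automatically
ALTER DATABASE SalesDb SET AUTO_CREATE_STATISTICS ON;

-- Explicitly create statistics on a column we expect to query heavily
CREATE STATISTICS Stats_Orders_OrderDate
ON dbo.Orders (OrderDate)
WITH FULLSCAN;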

Statistics are quite small and don’t take up as much space as indexes.  You can view statistics by running the sp_helpstats system stored procedure, by querying the sys.stats catalog view, or via the sys.dm_db_stats_properties dynamic management function.  The best way of examining statistics, though, is to use the database console command DBCC SHOW_STATISTICS.  When viewing statistics, low density values indicate a high level of uniqueness (density is effectively 1 divided by the number of distinct values).  Statistics histograms show a lot of data: RANGE_HI_KEY is the upper-bound column value for a histogram step, whilst RANGE_ROWS indicates how many rows fall between that step’s upper bound and the previous one.
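
A minimal sketch of examining statistics (the table and statistics names are hypothetical):

-- Show the header, density vector and histogram for one statistics object
DBCC SHOW_STATISTICS ('dbo.Orders', 'Stats_Orders_OrderDate');

-- List the statistics objects on a table via the catalog view
SELECT s.name, s.auto_created, s.user_created
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.Orders');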

The SQL Server Query Optimizer uses statistics heavily to generate the optimized query plan.  Note, though, that optimized query plans are not necessarily optimal for every situation; they’re the most optimal general-purpose plans.  The optimizer’s purpose is to come up with a good plan, fast, and statistics are necessary for this to be able to happen.  To make the most of the cardinality estimates from statistics, it’s best to ensure you use parameters to queries and stored procedures, use temp tables where necessary and keep column orders consistent.  Table variables and table-valued parameters can negatively affect cardinality estimates.  Whether the query optimizer selects a serial or parallel plan can be affected by cardinality, as can the choice to use an index seek versus an index scan.  Join algorithms (i.e. hash match, nested loops etc.) can also be affected.

From here, the query optimizer will decide how much memory it thinks it needs for a given plan, so memory grants are important.  Memory grants are effectively the cost of the operation multiplied by the number of rows that the operation is performed against, therefore, it’s important for the query optimizer to have accurate row count data from the statistics. 

One handy tip that Lori shares is in interpreting some of the data from the “yellow pop-up” box when hovering over certain parts of a query plan in SQL Server Management Studio.  She states how the “Estimated Number Of Rows” is what the table’s statistics say there are, whilst the “Actual Number Of Rows” are what the query plan actually encountered within the table.  If there’s a big discrepancy between these values, you’ll probably need to update the statistics for the table!

Statistics are automatically updated by SQL Server, although they’re only updated after a certain amount of data has been added or updated within the table.  You can manually update statistics yourself by calling the sp_updatestats system stored procedure.

By default, tables inside a database will have AUTO UPDATE STATISTICS switched on, which is what causes the statistics to be updated automatically by SQL Server occasionally – usually once around 500 rows plus 20% of the table’s rows have been added or modified.  It’s usually best to leave this turned on; however, if you’re dealing with a table that contains a very large number of rows and has either many new rows added or many rows modified, it may be better to turn off the automatic updating of statistics and perform the updates manually, either after a specific number of modifications or at certain appropriate times.
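
A minimal sketch of the manual options (the table and statistics names are hypothetical):

-- Refresh all statistics on one table, scanning every row
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Refresh any out-of-date statistics across the whole database
EXEC sp_updatestats;

-- Refresh one statistics object and opt it out of automatic updates
UPDATE STATISTICS dbo.Orders Stats_Orders_OrderDate
WITH FULLSCAN, NORECOMPUTE;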

Finally, it’s important to remember that whenever statistics are updated or recomputed, any execution plans built on those statistics that were previously cached will be invalidated.  They’ll need to be recompiled and re-cached.

After Lori’s session, there’s another quick coffee break, and then it’s on to the next session.  This one was Mark Broadbent’s “Lock, Block & Two Smoking Barrels”.  Mark’s session focused on SQL Server locks: the different types of locks, how they’re acquired, and how best to design our queries and applications to ensure we don’t lock data for any longer than we need to.

Mark first talks about SQL Server’s transactions.  He explains that transactions are not committed to the transaction log immediately; they are processed through in-memory buffers first before being flushed to disk.  Moreover, those log buffers need to fill to a certain size before they get flushed to disk, so there’s always a possibility of executing a COMMIT TRANSACTION statement yet the transaction not being visible within the transaction log until sometime later.  The transaction being available in the transaction log is the D in ACID – Durability – but Mark highlights that it’s really delayed durability.

Next, Mark talks about concurrency versus correctness.  He reminds us of some of the laws of concurrency control.  The first is that concurrent execution should not cause application programs to malfunction.  The second is that concurrent execution should not have lower throughput or higher response times than serial execution.  To balance concurrency and correctness, SQL Server uses isolation, and there are numerous isolation levels available to us, all of which offer differing trade-offs between concurrency and correctness.

Mark continues by stating that SQL Server attempts to perform our queries in as serial a manner as possible, and it uses a technique called transaction interleaving in order to achieve this between multiple concurrent and independent transactions.  Isolation levels attempt to solve the interleaving dependency problems.  They can’t completely cure them, but they can reduce the issues caused by interleaving dependencies.  Isolation levels can be set at the statement, transaction or session levels.  There are 4 types defined by the ANSI standards, but SQL Server 2005 (and above) offer a fifth level.  It’s important to remember that not all isolation levels can be used everywhere, for example, the FILESTREAM data type is limited in the isolation levels that it supports.
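
As a minimal sketch of setting the isolation level at different scopes (the table name is hypothetical):

-- Session level: applies to all subsequent transactions on this connection
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRAN;
    -- Statement level: a table hint overrides the session setting for this one statement
    SELECT quantity FROM dbo.Basket WITH (NOLOCK);   -- behaves as READ UNCOMMITTED for this table only
COMMIT TRAN;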

We’re told that SQL Server’s locks are two-phased, and are considered so if every LOCK is succeeded by an UNLOCK.  SQL Server has different levels of locks, and they can exist at various levels of granularity, from row locks to page locks all the way up to table locks.  When SQL Server has to examine existing locks in order to acquire a new or additional lock, it will only ever compare locks on the same resource.  This means that row locks are only ever compared to other row locks, page locks to other page locks and table locks to other table locks – they’re all separate.  That said, SQL Server will automatically perform lock escalation when certain conditions occur: when SQL Server has acquired more than 5000 locks of either row or page type, it will escalate those locks to a single table-level lock.  Table locks are the least granular kind of lock and are very bad for performance and concurrency within SQL Server – basically, the one query that holds the table-level lock prevents any other query from accessing that table.  For this reason, it’s important to ensure that our queries are written in such a way as to minimise the locks that they need, and to ensure that when they do require locks, those locks are as granular as they can be.  Update locks still allow reads, and updates to other rows of the same table, to proceed: they’re compatible with shared locks but not with other update locks or exclusive locks on the same resource, so it’s worth bearing in mind how many concurrent writes we attempt to make to our data.
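
A quick way to see this granularity in practice is to inspect the locks the current session is holding; a minimal sketch:

SELECT resource_type,      -- KEY (row), PAGE, OBJECT (table), etc.
       request_mode,       -- S, U, X, IS, IX and so on
       request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;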

Mark continues to show us some sample query code that demonstrates how some simple looking queries can cause concurrency problems and can result in lost updates to our data.  For example, Mark shows us the following query:

BEGIN TRAN T1
DECLARE @newquantity int
SELECT @newquantity = quantity FROM basket            -- read the current value
SET @newquantity = @newquantity + 1                   -- increment it in a local variable
UPDATE some_other_table SET quantity = @newquantity   -- write the incremented value back
COMMIT TRAN T1

The above query can fail badly, with the required UPDATE effectively being lost if multiple running transactions perform this query at the same time.  This is due to transaction interleaving: the two SELECTs can happen simultaneously and acquire the same quantity value, but the two UPDATEs are then performed in interleaved transactions, which means that the second UPDATE to run is using stale data, effectively “overwriting” the first UPDATE (so the final quantity value is one less than it should be).  The solution to this problem is to perform the quantity incrementing in-line within the UPDATE statement itself:

BEGIN TRAN T1
UPDATE some_other_table SET quantity = t2.newquantity FROM (SELECT quantity + 1 AS newquantity FROM basket) t2
COMMIT TRAN T1

Reducing the number of statements needed to perform some given function on our data is always the best approach.  It means our queries are being as granular as they can be, providing us with better atomic isolation and thereby reducing the necessity to interleave transactions.

After Mark’s session was over, it was time for lunch.  Lunch at the SQLBits conferences in previous years has always been excellent with a number of choices of hot, cooked food being available and this year was no different.  There was a choice of 3 meals, cottage pie with potato wedges, Moroccan chicken with couscous or a vegetarian option (I’m not quite sure what that was, unfortunately), each of which could be finished off with one of a wide selection of cakes and desserts!

I elected to go for the Moroccan chicken, which was delicious, and plumped for a very nice raspberry ripple creamy yoghurt.  An excellent lunch, as ever!

During lunch, I managed to catch up with a few old friends and colleagues who had also attended the conference, as well as talking to a few newly made acquaintances whilst wandering around the conference floor and the various sponsors stands.

After a good wander around, during which I was able to acquire ever more swag from the sponsors, it was soon time for the afternoon’s sessions.  There were only two sessions left in the day and, it now being around 14:30 after the late lunch hour was over, I headed off to find the correct “dome” for the first of the afternoon’s sessions, Erland Sommarskog’s “Dynamic Search Conditions”.

Erland’s talk will highlight the best approaches when dealing with dynamic WHERE and ORDER BY clauses in SQL Server queries, something that I’m sure most developers have had to deal with at some time or another.  For this talk, Erland will use his own Northgale database, which is the same schema as Microsoft’s old Northwind database, but with a huge amount of additional data added to it!

Erland first starts off by warning us about filtered indexes.  These are indexes that themselves have a WHERE condition attached to them (i.e. WHERE value <> [somevalue]), and these tend not to play very well with dynamic queries.  Erland continues by talking about how SQL Server deals with parameters to queries.  It performs parameter “sniffing” to determine how best to optimize a static query by closely examining the actual parameter values we’re supplying.  Erland shows us both a good and a bad example: WHERE xxx = ISNULL(@xxx, xxx) versus WHERE (xxx = @xxx OR @xxx IS NULL).  He explains how the intended query can fail if you use ISNULL in this situation.  We’re told how the SQL Server query optimizer doesn’t look inside the stored procedure itself, so it really has no way of knowing if any parameters we pass in are altered or modified in any way by the stored procedure’s code.  For this reason, SQL Server must generate a query plan that is optimized for any and all possible values.  This is likely to be somewhat suboptimal for most of the parameters we’re likely to supply.  It’s for this reason that the execution plan can show things like an index scan against an index on a “FromDate” datetime column even if that parameter is not being passed to the stored procedure.  When we’re supplying only a subset of parameters to a stored procedure with many optional parameters, it’s often best to use the OPTION (RECOMPILE) query hint to force a recompilation of the query every time it’s called.  This way, the execution plan is regenerated based upon the exact parameters in use for that call.  It’s important to note, however, that recompiling queries is an expensive operation, so it’s best to measure exactly how often you’ll need to perform such queries.  If you’re calling the query very frequently, you may well get the best performance from using purely dynamic SQL.
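
A minimal sketch of that static approach using OPTION (RECOMPILE) – the procedure, table, column and parameter names here are my own hypothetical examples, not Erland’s:

CREATE PROCEDURE dbo.SearchOrders
    @CustomerId int      = NULL,
    @FromDate   datetime = NULL,
    @ToDate     datetime = NULL
AS
BEGIN
    SELECT OrderId, CustomerId, OrderDate
    FROM dbo.Orders
    WHERE (CustomerId = @CustomerId OR @CustomerId IS NULL)
      AND (OrderDate >= @FromDate   OR @FromDate   IS NULL)
      AND (OrderDate <= @ToDate     OR @ToDate     IS NULL)
    OPTION (RECOMPILE);   -- the plan is rebuilt for the exact parameter values supplied on each call
END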

Erland then moves on to discuss dynamically ordering data.  He states that a CASE expression inside the ORDER BY clause is the best way to achieve this, for example: ORDER BY CASE @sortcolumn WHEN 'OrderID' THEN [Id] END, CASE @sortcolumn WHEN 'OrderDate' THEN [Date] END ... etc.  This is a great way to achieve sorting by dynamic columns; however, there’s a gotcha with this method, which is that you have to be very careful of datatype differences between the columns in the different CASE expressions, as these can often lead to errors.
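
A minimal sketch of the pattern (the table, columns and @sortcolumn value are hypothetical); keeping each column in its own CASE expression avoids mixing datatypes within a single CASE:

DECLARE @sortcolumn nvarchar(30) = N'OrderDate';               -- would normally be a procedure parameter

SELECT OrderId, OrderDate, CustomerId
FROM dbo.Orders
ORDER BY
    CASE @sortcolumn WHEN 'OrderID'    THEN OrderId    END,    -- int column
    CASE @sortcolumn WHEN 'OrderDate'  THEN OrderDate  END,    -- datetime column
    CASE @sortcolumn WHEN 'CustomerID' THEN CustomerId END;    -- each column in its own CASE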

Next, we look at the permissions required in order to use such dynamic SQL, and Erland says that it’s important to remember that any user who wishes to run such a dynamic query will require permissions on the underlying table(s) upon which the dynamic query is based.  This differs from (say) a stored procedure, where the user only needs permissions on the stored procedure and not necessarily on the underlying tables it uses.  One trick that can be used to gain somewhat the best of both approaches is to use the sp_executesql system stored procedure.  Using this will effectively create a nameless stored procedure from your query, cache it and execute it.  The cached plan can then be re-used on subsequent calls to the query, with the nameless stored procedure being identified by a hash of the query text itself.
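
A minimal sketch of parameterised dynamic SQL via sp_executesql (the table and parameter names are hypothetical):

DECLARE @sql    nvarchar(max) = N'SELECT OrderId, OrderDate FROM dbo.Orders WHERE CustomerId = @CustomerId',
        @params nvarchar(max) = N'@CustomerId int';

-- The plan for this statement is cached and keyed on the query text, so repeated
-- calls with different @CustomerId values re-use the same cached plan
EXEC sp_executesql @sql, @params, @CustomerId = 42;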

Another good point that Erland mentions is to ensure that all SQL server objects (tables, functions etc.) referenced within a dynamic query should always be prefixed with the full schema name and not just referenced by the object name (i.e. use [dbo].[SomeTable] rather than [SomeTable]).  This is important as different users who run your dynamic SQL code could be using different default schemas – if they are and you haven’t specified the schema explicitly, the query will fail.

Erland also mentions that one very handy tip with dynamic queries is to always include a @debug input parameter of datatype bit, with a default value of 0 (off).  It allows you to pass in a value of 1 (on) to ensure that code such as IF @debug = 1 PRINT @sql is run, letting you output the actual T-SQL query generated by the dynamic code.  Erland says that you will need this eventually, so it’s always best to build it in from the start.

When building up your dynamic WHERE clause, one tricky aspect is knowing whether to prepend an AND when adding the second or subsequent condition (the first condition of the WHERE clause doesn’t need one, of course).  One simple way around this is to ensure that every dynamically added condition is always the second or later one, by statically starting the WHERE clause with something benign such as “WHERE 1 = 1”.  This, of course, matches all records, and every subsequently added condition can then always be prefixed with an AND, for example, IF @CustomerPostCode IS NOT NULL SET @sql += ' AND Postcode LIKE …'.  It’s also important to always add parameters into the dynamic SQL rather than concatenating values (i.e. avoid doing @sql += ' AND OrderId = ' + @OrderId) as this will interfere with plan caching and your generated queries will be less efficient overall as a result.  Moreover, raw concatenation of values is a vector for SQL injection attacks.  For this same reason, you should always translate the values passed into your stored procedure for use in WHERE and ORDER BY clauses: map the passed parameter value to a specific hard-coded value that you explicitly control, and don’t just embed the passed-in value directly.
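
Pulling those pieces together, a minimal sketch of a dynamic search procedure (the procedure, table, column and parameter names are hypothetical):

CREATE PROCEDURE dbo.SearchOrdersDynamic
    @CustomerId int          = NULL,
    @PostCode   nvarchar(10) = NULL,
    @debug      bit          = 0
AS
BEGIN
    DECLARE @sql    nvarchar(max) = N'SELECT OrderId, OrderDate FROM dbo.Orders WHERE 1 = 1',
            @params nvarchar(max) = N'@CustomerId int, @PostCode nvarchar(10)';

    IF @CustomerId IS NOT NULL
        SET @sql += N' AND CustomerId = @CustomerId';           -- a parameter marker, never a concatenated value

    IF @PostCode IS NOT NULL
        SET @sql += N' AND PostCode LIKE @PostCode + N''%''';

    IF @debug = 1
        PRINT @sql;                                             -- the @debug switch described earlier

    EXEC sp_executesql @sql, @params, @CustomerId = @CustomerId, @PostCode = @PostCode;
END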

Occasionally, it can be a useful optimization to inline some WHERE clause values in order to force a whole new query plan to be cached.  This is useful in the scenario where, for example, you’re querying by order city and 60% of all orders are in the same city.  You can inline that one city value to have a cached plan just for that city and a different, single cached plan for all other cities.
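
As a minimal sketch of that inlining idea, continuing the dynamic SQL pattern above (the City column and the skewed 'London' value are hypothetical):

IF @City = N'London'
    SET @sql += N' AND City = N''London''';   -- the heavily skewed value gets its own dedicated cached plan
ELSE
    SET @sql += N' AND City = @City';         -- all other cities share a single parameterised plan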

Finally, for complex grouping, aggregation and the dynamic selection of the columns returned from the query, Erland says it is often easier and more robust to construct these kinds of queries in the client application code rather than in a dynamic-SQL-producing stored procedure.  One caveat here is to ensure that you construct the entirety of your query client-side (or entirely server-side if you must) – don’t try to mix and match by performing some client-side and some server-side.

And with this, Erland’s session on dynamic SQL search conditions is complete.  After yet another short coffee break, we’re ready for the final session of what has been a long, but information-packed day.  And for the final session, I decided to attend Simon D’Morias’ “What is DevOps For Databases?”

Simon starts with explaining the term “DevOps” and reminds us that it’s the blurring of lines between the two traditionally separate disciplines of development and operations.  DevOps means that developers are far closer to the “operations” side of applications which frequently means getting involved with deployments, infrastructure and a continuous delivery process.  DevOps is about automation of application integration and deployment, provably and reliably.

Simon shows the three pillars upon which a successful DevOps process is built: Develop, Deploy & Measure.  We develop some software, deploy it and then measure the performance and reliability of the deployment.  From this measurement we can plan better and feed this back into the next iteration of the cycle.  We’re told that to make these iterations work successfully, we need to keep changes small.  With small changes, rather than larger ones, we can keep deployment simple and fast.  It allows us to gather frequent feedback on the process and allows continuous improvement of the deployment process itself.  With the teams behind the software (development, operations etc.) being more integrated, there’s a greater spread of knowledge about the software itself and the changes to it in a given development/deployment cycle, which improves early feedback.  Automation of these systems also ensures that deployment is made easier and thus also contributes to better and earlier feedback.

When it comes to databases, DBAs and other database professionals are frequently nervous about automating any changes to production databases; however, by keeping changes small and to a minimum within a given deployment cycle, and by having a continuously improving, robust process for performing that deployment, we can ensure that each change is less risky than if we performed a single large change or upgrade to the system.  Continuous deployments also allow for detecting failures fast, which is a good thing to have.  We don’t want failures caused by changes to take a long time to surface before we’re made aware of them.  Failing fast allows easy rollback, and reliability of the process enables automation, which further reduces risk.  Naturally, monitoring plays a large part in this, and a comprehensive monitoring infrastructure allows detection of issues and failures and allows improvements in reliability over time which, again, further reduces risk.

Simon moves on to discuss the things that can break DevOps.  Unreliability is one major factor that can break a DevOps process; even something running at 95% reliability is no good, as that 5% failure rate will kill you.  Requiring approval within the deployment chain (i.e. some manual approval, governance or compliance process) will break continuity and is also a potential bottleneck for a successful DevOps deployment iteration.  A greater “distance” between the development, operations and other teams will impact their ability to be knowledgeable about the changes being made and deployed.  This will negatively impact the teams’ ability to troubleshoot any issues in the process, hindering the improvement of reliability.

It can often be difficult to know where to start when moving to an automated and continuous DevOps process.  The first step is to ensure we can “build ourselves a pipeline to live” – a complete end-to-end automated continuous integration process.  There are numerous tools available to help with this, and SQL Server itself assists in this regard with the ability to package SQL Server objects into a DACPAC package.  Simon insists that attempting to proceed with only a partial implementation of this process will not work; it’s an all-or-nothing endeavour.  Automating deployments to development and test environments, but not to the production environment (as some more nervous people may be inclined to do), is like building only half a bridge across a chasm.  Half a bridge is no bridge at all!

Simon concludes by showing us a quick demo of a simple continuous deployment process using Visual Studio to make some local database changes, which are committed to version control using Git and then pushed to Visual Studio Team Services (previously known as Visual Studio Online) which performs the “build” of the database objects and packages this into a DACPAC package.  This package is then automatically pushed to an Azure DB for deployment.

Finally, Simon suggests that one of the best ways to ensure that our continuous deployment process is consistent and reliable is to ensure that there are minimal differences (ideally, no differences) between our various environments (development, test, staging, production etc.), and especially between our staging and production environments.

After Simon’s session was over, it was time for all of the conference attendees to gather in the main part of the exhibition hall and listen as one of the conference organisers read out the names of those who had won prizes by filling in forms and entering competitions run by each of the conference sponsors.  I didn’t win a prize and, in fact, had entered very few competitions, having been far too busy either attending the many sessions or drinking copious amounts of coffee in between them!  Once the prizes were all dished out, it was sadly time for yet another fantastic SQLBits conference to end.  It had been a long but superb day at another excellently organised and run event.  Here’s hoping next year’s conference is even better!

Beware NuGet’s filename encoding!


The other day, I was troubleshooting some issues that had occurred on a deployment of some code to a test server.  Large parts of the application were simply not working after deployment, however, the (apparently) same set of code and files worked just fine on my local development machine.

After much digging, the problem was finally discovered as being how NuGet handles and packages files with non-standard characters in the filename.

It seems that NuGet will percent-encode (URL-encode) certain characters within filenames, such as spaces, @ symbols etc.  This is usually not a problem, as NuGet itself will correctly decode the filenames again when extracting (or installing) the package, so, for example, a file named:

READ ME.txt

within your solution will be encoded inside the .nupkg file as:

READ%20ME.txt

And, once installed / extracted again using NuGet, the file will get its original filename back.  However, there’s a big caveat around this.  We’re told that NuGet’s .nupkg files are “just zip files” and that simply renaming the file to have a .zip extension rather than a .nupkg extension allows the file to be opened using 7-Zip or any other zip archive tool.  This is all fine, except that if you extract the contents of a .nupkg file using an archiving utility like 7-Zip, any encoded filenames will retain their encoding and will not be renamed back to their original, correct filenames!

It turns out that my deployment included some manual steps, one of which was the manual extraction of a .nupkg file using 7-Zip.  It also turns out that my package contained files with @ symbols and spaces in some of the filenames.  These files were critical to the functioning of the application, and when they were manually extracted from the package, the filenames were left in an encoded format, meaning the application could not load the files as it was looking for them by their correct (non-encoded) filenames.

I’m now an MCSD in Application Lifecycle Management!


Well, after previously saying that I would give the pursuit of further certifications a bit of a rest, I’ve gone and acquired yet another Microsoft certification.  This one is Microsoft Certified Solutions Developer – Application Lifecycle Management.

It all started around the beginning of January this year when Microsoft sent out an email with a very special offer.  Register via Microsoft’s Virtual Academy and you would be sent a 3-for-1 voucher for selected Microsoft exams.  Since the three exams required to achieve a Microsoft Certified Solutions Developer – Application Lifecycle Management exams were included within this offer, I decided to go for it.  I’d pay for only the first exam and get the other two for free!

So, having acquired my voucher code, I proceeded to book myself in for the first of the 3 exams.  “Administering Visual Studio Team Foundation Server 2012” was the first exam, which I’d scheduled for the beginning of February.  Although I’d had some previous experience of setting up, configuring and administering Team Foundation Server, that was with the 2010 version of the product, and I realised I needed to both refresh and update my skills.  Working with a local copy of TFS 2012 and following along with the “Applying ALM with Visual Studio 2012 Jumpstart” course on Microsoft’s Virtual Academy site, as well as studying the excellent book “Professional Scrum Development with Microsoft Visual Studio 2012” that is recommended as a companion/study guide for the MCSD ALM exams, I quickly got to work.

I sat and passed the first exam in early February this year.  Feeling energised by this, I quickly returned to the Prometric website to book the second of the three exams, “Software Testing with Visual Studio 2012”, which was scheduled for March of this year.  I’d mistakenly thought this was all about unit testing within Visual Studio, and whilst some of that was included, it was really all about Visual Studio’s “Test Manager” product.  The aforementioned Virtual Academy course and the book covered all of this exam’s content, however, so continued study with those resources, along with my own personal tinkering, helped me tremendously.  When the time came I sat the exam and, amazingly, passed with full marks!

So, with 2 exams down and only 1 to go, I decided to plough on and scheduled my third and final exam for late in April.  This final exam was “Delivering Continuous Value with Visual Studio 2012 Application Lifecycle Management” and was perhaps the most abstract of all of the exams, focusing on agility, project management and best practices around the “softer” side of software development.  Continued study with the aforementioned resources was still helpful, however, when the time came to sit the exam, I admit that I felt somewhat underprepared for this one.  But sit the exam I did, and whilst I ended up with my lowest score from all three of the exams, I still managed to score enough to pass quite comfortably.

So, with all three exams sat and passed, I was awarded the “Microsoft Certified Solution Developer – Application Lifecycle Management” certification.  I’ll definitely slow down with my quest for further certifications now….Well, unless Microsoft send me another tempting email with a very “special” offer included!
