ASF 017: Alex Whittles interview

Introduction

Alex Whittles is the owner and principal consultant at Purple Frog, a SQL Server Business Intelligence consultancy in the UK with multinational clients in a variety of sectors. He specialises in all aspects of dimensional data modelling, data warehousing, ETL and cubes using the SQL Server stack.
Alex has an MSc (Master of Science) in Business Intelligence, is a chartered engineer, and a member of Mensa. His community leadership includes being on the SQLBits committee, Director of SQL Relay, and founder and leader of the Birmingham Data Platform user group.
Alex is also a regular speaker at many SQL Server events around the world including SQL Relay, SQLBits, SQL Saturdays, 24 HOP and the PASS Summit.

This talk took place during the Data Relay (formerly SQL Relay) conference in Reading (UK) on Thursday, 11th October 2018.
Interviewer: Kamil Nowinski.

Which unusual means of transportation does Alex use from time to time? Why does monitoring data quality matter, and what is a very efficient alternative to SCD in SSIS in the BI loading process? Which skill is important if you want to jump into the IT market?
Find out the answers to these questions and much, much more.

Audio version

 

Don’t have time to read? You can listen to this as a podcast, wherever you are and whatever you use.
Just use the player directly from this site (above), find it on Spreaker, Apple Podcasts, Spotify (new!) or simply download MP3.

Enjoy!

Transcript

Kamil Nowinski: Hi Alex. Thanks for being a guest of this podcast and accepting my invitation.
Could you, at the beginning, tell me what your name is and where you live?

Alex Whittles: Good afternoon! Thanks for inviting me. My name’s Alex Whittles and I live in Telford in Shropshire.

KN: I know that you have your own consultancy company. What exactly do you do for a living?

AW: So I run Purple Frog, which started 12 years ago now and it’s a Microsoft BI consultancy. Initially, we started focusing on data warehousing, ETL, data modeling and cubes with a special focus on cubes. But over the years, as the Microsoft data platform has evolved and expanded, that’s grown in breadth into Power BI, Databricks, Data Lake, machine learning, quite a lot of different topics now. But still focused around the concept of taking a vast amount of data from around organizations and putting it into an easy, simple way of accessing useful information and spreading it around the business. And the tools evolve and change over the years, but the need of a company to easily access its own data in a reliable, consistent way – that’s still the same as it always has been and that’s what we do as a business.

KN: Do you build any frameworks, for example, to help you achieve your goals faster?

AW: So we have a number of different methods of what we do, different approaches. One is pure consultancy, where we go into a business and just talk to them about opportunities, what they can do, how they can use technology to solve a problem. Advise them, mentor their staff, provide training, advice and hand-holding to help them achieve their goals and put them in the right direction. Then we have a development arm where we build complete end-to-end solutions and systems internally and yeah, we’ve spent a lot of the last 12 years building up quite a comprehensive framework of code that helps us build much more highly scalable, robust solutions. So if you look at a standard data warehouse and ETL solution, probably about 80% of the code is reused project to project to project. We’re talking about restartability, robustness, logging, error handling, sequencing control, management – all this kind of stuff.

KN: And basically, it doesn’t matter if you use old-fashioned ETL or the newer stuff, like Azure Data Factory?

AW: Exactly. The concept’s the same. The actual metadata, the information you need and the approach to it is still very, very similar. Okay, we move from ETL to ELT or ETLT, various different combinations, but the concept is there, and the underlying functionality is still very, very similar. So we’ve spent a lot of time and resources building up a large framework and code automation set. When we define a particular user’s or a particular customer’s requirements and work out what they need, the complex job is doing the requirements gathering and the source-to-target mapping: where the data is coming from. Once we identify that metadata, then the process of actually building the code should be very straightforward and so our framework automates a lot of that code build, which means it’s very standardised, reusable code. Very, very consistent, so any of our developers can pick up any code that any other developer’s done and know exactly how it works. But also, our customers get a very consistent approach to the code that’s being developed. And that allows us to build much bigger solutions. We’re actually quite a small team. We’re a team of six people, but we’ve done projects for huge multinational corporations that you would think would take teams of fifty or a hundred. And because of a lot of this code automation and the underlying framework, we can do very, very large projects with actually quite a small team.

KN: Exactly, that was my next question, whether you and your company are basically the same thing. I’ve seen that you’re hiring people, and you mentioned that the size of your company is six people, yeah?

AW: Well, it’s actually five today, it’ll be six on Monday.

KN: Okay, great, congratulations, new member of the team.

AW: We’re growing as fast as we can. The main limitation we have is availability of staff and resources. So we’re always on the lookout for good people and that could be existing senior people that are already skilled in technology that can come in on a consultant level, or it could be junior staff that we can train up, and we have a very strong ethos in the company of training, mentoring and growing skills. With the pace of change of the technology in the Microsoft Data Platform world, it’s no good if you’re an expert in technology today, because it’s going to change tomorrow. So we have to keep evolving and learning new technology, and probably 20% of our time in the office is spent exploring new technologies as a constant iterative learning process, and so even our experienced staff have to keep learning, have to keep playing with new technologies, doing proof-of-concept projects. We’re pretty open to taking on anybody, no matter what your skill level, as long as you have the desire to learn, the ability to learn and as long as you enjoy working with this tech. If you don’t enjoy working with it, you’re not going to do a good project. We want people who really embrace and enjoy this kind of work and enjoy playing with data. And that takes a special kind of person.

KN: It’s amazing that you are answering my questions in advance basically. You have answered three of my questions.

AW: It’s all machine learning predictive models. We plan for it in advance.

KN: So yes, you’re still looking for new people for your company, that’s good. With different levels of experience, yeah?

AW: Yeah, so the last few people we’ve taken on: the guy starting on Monday is focusing on Power BI and Pyramid Analytics, but with an underlying core of data modelling, data warehousing and ETL, and focusing a lot on the actual presentation of that to customers. The previous chap who started, his main focus is on machine learning, so his background degree was in artificial intelligence and machine learning models and he’s really pushing that side of the business for us. The guy before that is actually a C# developer, so even though we’re a business intelligence house, we are finding there’s a lot more coding coming into our world and so we’ve now got a C# person and we’re looking for another one actually as well.

KN: So C# helps you in SSIS scripts and also in U-SQL, yeah?

AW: Yeah, it’s not just C#. It’s C#, it’s Python, it’s R, there’s a lot of code creeping into our world these days.

KN: More and more languages.

AW: When you look at tools like Databricks, you’ve got Scala, you’ve got Python in there, you need to have a lot of languages.

KN: Maybe in the near future you will not need to know these languages, because there are new features in Azure Data Factory like Data Flows.

AW: Absolutely. But then another part of Purple Frog is that we also provide a managed service. So if you’ve got a business intelligence solution with a Data Warehouse and an ETL system, a cube, then as a business you need that to be up and running and to maximize uptime, because it becomes a business critical solution. If your board or CxOs are relying on this system to produce all of their management KPIs and their understanding of the performance of the business, when that’s down for two days, it can cause you problems. Some of our customers rely on it hour by hour during the day in a real-time environment to manage their real-time stock distribution and manufacturing warehouses. Banking customers need real-time information from their warehouse to actually manage their investments and portfolios. So a good data warehouse and a good cube or reporting solution will very quickly become a business critical system. And so we provide a monitoring system where we look after those solutions for customers and monitor them in real time on a second-by-second basis and monitor the data quality, the data accuracy as well as the actual functionality and successful completion of the data load jobs and everything else that goes with it. Even the duration of how long the load takes, how long the cube’s taking, query duration, that kind of thing. So we have quite a comprehensive monitoring solution that we built in-house using C# that just sits there, constantly checking your servers, and as soon as there’s a potential problem, it alerts our support desk and we can jump in straight away and actually help fix that for you, before you even know there’s a problem. So things like that, it’s not actually a BI solution but it’s a business critical C# solution that we’ve developed, that helps us manage and maximize the uptime of our customers’ environments. So everywhere we look there’s code, there’s C#, things like that coming into it.

KN: From the business perspective, you can save a lot of money using those solutions.

AW: Absolutely. So we had a choice of buying an off-the-shelf solution or numerous off-the-shelf solutions that are available. Most of the vendors of monitoring software tend to focus more on the DBA side of the world. And there’s some fantastic tools out there that monitor databases from a DBA perspective, but from a BI perspective, it’s not necessarily enough just to be able to look at how fast your queries are running in the data warehouse, or how fast your query is running in the cube. It’s also really critical to look at: is the data right? Has something changed in your data that isn’t right? It may be working technically, but is the actual result coming out correct? So take a retail environment. We know that your retail sales for last year shouldn’t change. Therefore, we monitor every five minutes. We run a check to say “OK, the volume of sales last year was this value”. As soon as that changes… any range you define. We work with customers to define specific or customized checks and so, let’s say we’re looking at last year’s sales. We know that we’re already in October, so last year’s sales shouldn’t change. Every five minutes we’ll check that number and make sure that’s consistent in the cube. That should never change. If it does, it means you’ve got some rogue data coming into the system. Transactions for this week have gone against the wrong year. Something’s gone wrong in that chain of process. The data’s loaded okay, but it’s actually not valid data, and so we check the data quality. For things like year-on-year performance or year-on-year changes, we can run minute-by-minute checks or hourly checks to say “okay, the value of sales in this week should be roughly equivalent to the comparable week last year, plus or minus 10% growth for example”. And as long as your data falls within that region, that boundary, we know that it’s pretty much gonna be okay.
As soon as it falls outside the boundary, it indicates there may be some rogue data or something’s happened that has invalidated the environment. And so it’s about technical functionality and performance, and also data quality. Can our customers rely on their cube to give them accurate data? And so it’s a very specific business intelligence orientated monitoring service rather than purely looking at locks and blocks and waits in a database.
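
As an illustration, the two kinds of data quality check described here (a closed period whose total should never move, and a current figure that should stay within a band around a comparable earlier figure) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not Purple Frog's monitoring tool, which the interview says is written in C#. The function names, the `CheckResult` type and all figures are hypothetical, and the scheduling (every five minutes) and the cube queries are left out.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_frozen_total(name, baseline, current, tolerance=0.0):
    """A closed period (e.g. last year's sales) should not drift from its baseline."""
    drift = abs(current - baseline)
    detail = f"baseline={baseline} current={current} drift={drift}"
    return CheckResult(name, drift <= tolerance, detail)


def check_within_band(name, reference, current, band=0.10):
    """The current value should sit within +/- band of a comparable reference value."""
    lower = reference * (1 - band)
    upper = reference * (1 + band)
    detail = f"expected between {lower} and {upper}, got {current}"
    return CheckResult(name, lower <= current <= upper, detail)


def failed_checks(results):
    """Return only the checks that should raise an alert."""
    return [r for r in results if not r.passed]


# Hypothetical figures, standing in for values queried from the cube.
results = [
    check_frozen_total("sales_last_year", baseline=5_000_000.0, current=5_000_000.0),
    check_frozen_total("sales_last_year_after_bad_load", baseline=5_000_000.0, current=5_012_300.0),
    check_within_band("sales_this_week_vs_last_year", reference=100_000.0, current=104_000.0),
    check_within_band("sales_this_week_spike", reference=100_000.0, current=125_000.0),
]

for alert in failed_checks(results):
    print(f"ALERT {alert.name}: {alert.detail}")
```

In this sketch a load can succeed technically and still trip an alert, which is the distinction drawn above between job monitoring and data quality monitoring.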

KN: That’s an interesting approach to monitoring basically, how to monitor the quality of data.

AW: It monitors not only the technical side of it, but the confidence level. We can find these issues and resolve them before most businesses even know they exist. And so by building up, as we get a managed service contract with the customer, we work with them over time to constantly build this set of checks to add more and more over time to improve the robustness and reliability of their solutions. That’s a constant growing involvement and yeah, it’s interesting.

KN: OK, so our listeners and readers can’t see it but currently we are sitting at Microsoft HQ in Reading. This is the fourth day of Data Relay. Tell me something about this event.

AW: So Data Relay, which until this week was called SQL Relay, started 9 years ago now as a brainchild of two fantastic people who have been long-term founding members of the Microsoft SQL Server community, Tony Rogerson and Chris Testa-O’Neill, who have been instrumental in shaping the whole SQL Server and Microsoft Data Platform community in the UK. Both of them long-standing MVPs, Chris now a Microsoft employee. They were discussing, I think they were over at the MVP summit in Seattle ten years ago, and came up with the concept of SQL Relay as a community event. So in the UK, we’re very, very lucky to have a large number of Microsoft data events. We have SQLBits, which is the second biggest conference of its kind in the world, and we have a lot of user groups or meetups. They run all around the country in the evenings over a couple of hours. And there wasn’t really anything in between, so SQLBits is once a year, a huge event, a couple of thousand people, great event. And it just keeps getting better and better, we love it. But there was nothing really in between. And a lot of the user groups were operating entirely independently and not really communicating too well as a whole about coordinating speakers, approach to things and just working together. So Chris and Tony came up with the concept of SQL Relay to really bring the Data Platform community leaders together to run a centralized event. And it started out in its first year as a coordinated set of evening user groups over a week, across the entire UK, culminating in an event in Thames Valley Park in Reading here, where we are today. And if I remember correctly, we had the amazing Itzik Ben-Gan doing a keynote speech on introducing Windowing Functions to SQL Server and our minds were just blown away. And it worked, it got all of the community leaders working together and discussing how we can actually evolve the Microsoft community in the UK and how we can improve it to make it bigger and better.
And so the second year SQL Relay ran again and instead of just being evening user groups, we extended it to all-day events. They were spread over two weeks. And so we had more speakers involved, we managed to get some funding from Microsoft and it turned into a full-day single-track event and it just grew. And there was an opportunity to really evolve it and turn it into a really powerful feature in the UK data community calendar. So I was involved from year one, obviously I run the Birmingham user group, so I was involved from that perspective. I think the first year Tony took the lead as chair of SQL Relay to really coordinate the entire event. Second year, if I remember correctly, it was Chris who took over, Chris Testa-O’Neill as the lead. And then in year 3 they handed the baton to me. So I took over as the chair of SQL Relay and ran it for a year. And it was a really fascinating challenge for, ultimately, a data geek. I play with databases and I run my own business, so I have that side of it with the management and the financial management and everything else that goes with it. But running a conference is a whole different set of challenges. I took it on without really realizing the extent of how difficult it was going to be, and how time-consuming, but it was incredibly rewarding and I loved it. Really enjoyed it. And so in that year 3 we expanded and we still had two weeks’ worth of events, so I think we covered eight venues across the UK over a two-week period. And at each event we had a couple of different tracks, so two rooms in parallel with various speakers coming in throughout. And it was a huge success.

KN: When was it, the eight days?

AW: Now you’re stretching my memory, crikey! I know we did a Scottish venue. I can’t remember whether it was Glasgow or Edinburgh; we did both of those in years three and four, I can’t remember which way round it was. I’ve got a feeling it may have been Glasgow that year. We did, I’ve got a feeling, Newcastle, Leeds, Birmingham, Reading, London, Cardiff, Southampton… I can’t remember where else. That’s a good starting point anyway. Something like that anyway, and so I got more and more involved in that year, obviously, with leading it and leading the sponsorship and getting more sponsors involved to help fund it. And over the years, the committee members have changed, a lot of people have come into the committee, a lot of people have left, a lot of people have joined Microsoft; I think three SQL Relay committee members have now joined Microsoft. And so the people involved have changed, but there’s always been other community leaders that run user groups. And one of our initial reasons for doing it was to help promote user groups in the community in the UK. And that’s all very much a part of it. Every keynote or introduction, we’re always trying to promote other user groups. I was actually talking to a gentleman earlier today who lives in Northampton. And there’s currently no meetup in Northampton and he’s quite keen to get involved in the community, so I’m now talking to him about how we go about setting up a new user group there. It’s about what we need to do. It’s setting up a venue, how to promote it, how to get the speakers, getting contacts, that kind of thing. Venues don’t come for free, and we like to put on catering if we can, get some pizzas in or something, and that all takes a bit of money, so which sponsors are around that can actually help with that, what kind of venue do you need, how do you go about it. So when I set up the Birmingham venue, Tony Rogerson did the same for me. He gave me a lot of advice on how to go about setting up the event.
And now he’s passed the baton on and I’ve had a few years’ experience, so it’s really nice to be able to help out other people. And that’s one of the nice things about the UK. Not just the UK: across the whole worldwide Microsoft community, everyone wants to help. Everyone wants to pass on information and knowledge and work together to make it happen. And that is really typified by SQL Relay, or Data Relay now. It’s a bunch of people who care about the technology, who care about wanting to promote the technology and want to help other people learn about it. And that can help user groups and meetups as well: encourage new speakers to come through, give them a platform to speak on. There’s a couple of speakers this week who have never spoken before on a big stage and it’s really nice to be able to give them a platform to help encourage and develop their own speaking skills, to get new speakers coming through talking about new topics or even existing topics, but they’ve got a different way of presenting it. They’ve got a different set of experience or they solve different problems.

KN: And this works especially for the SQL Server area, we call it SQL family. It works perfectly I think.

AW: That’s what it’s about. It’s about always working together to help each other learn. I was very lucky early on in my career, when I was first starting and trying to work out how to build these solutions. I never had an employer that would pay for training and so I relied very heavily on people who would give their time to write blog posts or to talk at conferences. SQLBits to me was such an important conference to go and attend to learn how to do things and to learn from the likes of Chris Webb, for example, who’s speaking here today, and I’ve just seen him do a session on M. Fantastic session. I learned a lot of what I know about cubes from watching Chris do talks and reading blog posts and talking to him about it. And a lot of the stuff we learn as a community is shared freely amongst each other in blogs, in books, in conferences. Since I learned a lot from going to these conferences, it’s really nice now to be in a position where I can help feed that back and help pass on that knowledge to the next generation and help now run the conferences, and also talk about technology as well.

KN: That leads nicely into the next question. What advice would you give to young people who want to start working in the IT market?

AW: It’s very challenging. Back when I started, and that was some time ago, the Microsoft Data Platform world was actually a relatively well-defined small world. It was SQL Server, it was Integration Services. Actually, it was DTS, but I won’t go back that far. It was Analysis Services or OLAP Services as it was, and that was really it. The challenge of any project wasn’t really determining the architecture, it was understanding the customers’ requirements and building the data model. That’s why a lot of my focus is all around data modeling, because so many problems in a project happen because of poor data modeling and that was always the core focus of my work. When you’re getting into this technology now, it’s not just SQL Server, it’s not just Analysis Services. Is it SQL DB in Azure, is it Managed Instances, is it SQL DW, is it SQL in a VM, is it Data Lake, is it Databricks, is it R, is it Python, is it Data Factory, is it Integration Services, is it Power BI, is it Flow…

KN: Even without counting the cloud solutions, in the on-premises SQL Server package you now have PolyBase, the R language, Columnstore, all these technologies that you need to learn.

AW: I mean it’s great that all this technology exists, but it makes it actually quite difficult to be an expert in everything. And so knowing what you want to focus on is actually a really big challenge. So anyone starting out in this industry, if you’re trying to choose an area to go into, first of all, the Microsoft Data Platform world is a wonderful place to be in. Compare it to other competing environments. Just the community itself is so much more sharing, open and welcoming; you will find it much easier to learn and find it easier to make contacts in the Microsoft world than other worlds, which is why I still work in a Microsoft world. Coming to events like SQL Relay, going to SQLBits, SQL Saturday, all these community events, gives you an opportunity to spend an hour learning about a particular topic. Now you may not leave that hour being an expert in it, but you’ll understand the concept of what it is. And so it’s getting more and more important to be aware of all the different components in the Data Platform. Not to be an expert in every one of them, but to be aware that they exist and to know how they may fit into a jigsaw, so that you can focus on the area that you’re currently working on, but when you get a problem, you have an awareness of what other technology is available to you to be able to solve that problem. And you can think back: “OK, last year I saw a talk on, I don’t know, Flow for example. Great, I’ve seen that work in a demo, I think that may be suitable. I’ll now spend a few days looking into it, playing with it and actually trying it out”. But unless you’ve seen a talk at a conference or read a blog, how would you know Flow even exists? How would you know whether Databricks or Data Lake are valid options for a big data analytics solution? So it’s important to have an overview of all these different technologies, so if you do go to the conference, don’t just go to the sessions that tick the box of what you’re working on now.
Broaden your horizon, go and see the sessions that don’t tick your boxes, the things you don’t know you need to know about, and you will find out that a number of those topics actually do become very, very useful for you in future months or years. Because at some point you’ll hit a problem and you’ll think back to that session and think “Yeah, I know that technology is gonna work”, but also you know who the speaker was. If you need help with it, you know an expert you can ask. Most of our speakers, whether or not they’re consultants or contractors, are very helpful people and they like talking about this technology. Not a single one of them wouldn’t respond to an email asking for some help or advice. So it’s a very good way of finding out useful information about other topics. So that means you’ve got a wide breadth of awareness of tools and that then means, whatever job you go and look for, you can start to expand your breadth of knowledge, and over time focus. Not once in my career did I ever decide to be a business intelligence consultant, I kind of fell into it through a series of opportunities, mistakes, scenarios that I didn’t plan or expect. And I’ve ended up doing this job that I absolutely love with a passion. But I never chose to do it. I just happened to end up doing this through a sequence of random events. So don’t ever block off channels or close your mind to opportunities. If there’s an opportunity to get into a particular technology, take it with both hands and the community will help you learn.

KN: So you also have your own blog, yes?

AW: I do have a blog. Having said that, it may not be as up-to-date as it used to be. I used to religiously blog at least once a month. These were always quite in-depth technical blogs that may take a number of days to write. Sometimes even weeks to write, days or weeks, yeah. So some blogs I was preparing for even months of refining code and getting it right and getting it ready and then it’s ready for a blog. I would use the blogs as a way of actually starting out with ideas about problems I’d solved or different interesting techniques of data. And then, if the blog post took off and gained traction with comments and views, then I would turn that into a conference talk and I would go and present that at SQL Saturday, user groups, SQLBits, PASS Summit, etc. Then I started to get more involved in actually running the community, so now I run the Birmingham Data Platform meetups, I’m director of SQL Relay, I’m also on the committee for SQLBits. Those three combined, especially Data Relay and SQLBits, take up a phenomenal amount of my time and they pretty much take up all the time that I used to have to write blogs. So I think I’ve only written two blog posts in the last six months. I do need to get back to that, because I find that a really good way of learning for myself. If you want to learn new technology, set yourself a challenge, set yourself a scenario to solve, figure out a solution to it, write a blog about it. It’s gonna help someone else. It also helps crystallize in your own mind how a technology works. Doing a conference talk really helps you learn a new technology when you’re preparing the talk. But at the moment, I tend to focus on doing talks at conferences rather than writing blogs.

KN: Completely understandable. I think you shouldn’t worry about it, because during the session I was leading two days ago in Birmingham, sorry, in Leeds on the second day, when I mentioned SCDs, one of the attendees mentioned your blog post. Most probably we read the same posts. I still remember I saw a lot of comments under that post.

AW: I did a series of blog posts, I think three blog posts on Slowly Changing Dimensions in Data Warehouses and how to use the T-SQL MERGE statements to automate that. To this day that is by a country mile the most popular blog post that I have ever written, or series of blog posts. I actually did my master’s degree thesis on that topic, on the performance characteristics of loading SCDs and the blog posts came out as a result of that. Actually, my second ever SQLBits talk was on that topic. What’s really great, what I love is I wrote that blog post about eight years ago, something like that, and still to this day I get comments on “is this still the best approach to doing it?” Well, I had that comment last week.

KN: The MERGE?

AW: Yes, the MERGE. Absolutely. I use it for cloud solutions, on-premise solutions, every single day.
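
For readers unfamiliar with the idea, the Type 2 slowly changing dimension logic that such a load has to perform (expire the current version of a changed row, then insert a new version) can be sketched as follows. This is deliberately not the T-SQL MERGE from the blog posts mentioned above; it is a small in-memory Python sketch of the same matching logic, under stated assumptions. The function `apply_scd2`, the `valid_from` and `valid_to` columns and the sample customers are all hypothetical.

```python
from datetime import date


def apply_scd2(dimension, source, key, tracked, load_date):
    """Return a new dimension list with Type 2 changes applied.

    dimension: list of dict rows; a row with valid_to None is the current version.
    source:    list of dict rows holding the latest state from the source system.
    key:       name of the business key column.
    tracked:   column names whose changes create a new version.
    """
    result = [dict(row) for row in dimension]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["valid_to"] is None}

    for src in source:
        existing = current.get(src[key])
        if existing is not None and all(existing[c] == src[c] for c in tracked):
            continue  # matched and unchanged: nothing to do
        if existing is not None:
            existing["valid_to"] = load_date  # matched and changed: expire old version
        new_row = {key: src[key], "valid_from": load_date, "valid_to": None}
        new_row.update({c: src[c] for c in tracked})
        result.append(new_row)  # new key, or new version of a changed key
    return result


# Hypothetical example: customer 1 moves city, customer 2 is new.
dim = [{"customer_id": 1, "city": "Leeds", "valid_from": date(2017, 1, 1), "valid_to": None}]
src = [{"customer_id": 1, "city": "Reading"}, {"customer_id": 2, "city": "Telford"}]
updated = apply_scd2(dim, src, "customer_id", ["city"], date(2018, 10, 11))

for row in updated:
    print(row)
```

A set-based statement in the database does this matching in one pass over the whole table rather than row by row, which is the performance point made in the interview; the sketch only shows which rows end up expired and which are inserted.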

KN: Have you heard about SCD Merge Wizard? It’s an application that helps you create a MERGE statement for mapping your source and target. Then you can define the business logic behind it using type 1, 2, 3 and 6.

AW: I have not come across that.

KN: I’m a member of this project, I became a member of this project like 2 or 3 years ago, because one of the original authors didn’t have time to develop that project, so I was happy to continue that work.

AW: What’s great about that is it shows the concept of community helping each other, so when I first did that post, using MERGE to do that was never considered an option, no-one had really done it before, but to me it seemed a perfect opportunity to use new functionality that was being introduced in the SQL Server engine to solve a complex problem in BI loading. And when you’re trying to manage SCD loading in Integration Services, it’s not a pretty thing, it doesn’t perform very well. So this to me seemed a perfect marriage of these concepts so I spent a lot of time investigating and researching the process of doing that and it works brilliantly. That concept of using MERGE has now completely taken off and everywhere I go, I speak to people who are using MERGE to load their data warehouses and then other people, like you, the same with your project, you’re then enhancing that and taking it forward, improving it again. I did another blog post and series of talks; my first ever SQLBits talk was about automating the documentation of cubes. No one likes writing documentation, it’s a boring process, so I wrote a process of using DMVs, Dynamic Management Views within SQL Server Analysis Services, to query the metadata and structure of a cube, pull that into Reporting Services and then visualize the structure of a cube, basically creating automated real-time documentation, purely to save me time writing documentation. I give that to every customer and I put all the code on a blog and that was probably nine years ago, something like that. And it still works! But that’s Multidimensional Cubes. I was talking to Steve Powell, another community leader in the data platform.
And he’s taken that code off the blog, converted it, enhanced it, improved it, added some PowerShell to make it work against Power BI, so you can now use that similar concept, but enhancing it another stage further, so he’s taken that code, built on it, I’m now taking his code and building it further, and so we’re all helping each other out and we all gain better code and we get more features and functionality. And when you start sharing stuff, you get it back again, everyone gains from it, and that’s what I love about this community.

KN: That is exactly the idea behind GitHub. Open source code.

AW: Yeah.

KN: So we are talking about the communities, SQL family members, etc. What do you think about the MVP program these days?

AW: It is fantastic. I am very, very fortunate to have been awarded the MVP award by Microsoft. I’m a passionate supporter of the program, both for its value to the community and because it gives you a level of contact within Microsoft. Obviously, there are very strict non-disclosure agreements that we have to sign with Microsoft, so we won’t disclose any secret information outside of the program, but that means that the designers, architects and program managers within Microsoft can then share information with us about the future direction of their platforms – roadmaps. And that does two things: first of all, it means we, as a community, are able to feed back to Microsoft and say what we love about that, what we’re a bit concerned about, where we think that could be tweaked, enhanced, and provide very early feedback on the pros, cons and benefits of their approach. It also means we can become familiar with the concept and the approach to what’s happening and start writing blogs, conference talks.

KN: Sharing the knowledge, yes?

AW: Keeping that private initially, but have content ready, so that as soon as Microsoft release a bit of functionality, we can hit Publish on our blog post and straight away other people in the community want to learn about that, there’s already content for them to go and look at and learn. And as soon as you get announcements at Microsoft Ignite, places like that, you will straight away see a lot of blog posts, a lot of people talking on Twitter about the topics, because we’ve been given the opportunity to actually have inside information to see that beforehand. And I love that, because it gives us an opportunity to actually get a head start and provide more content to the community. What it also means, from a consultant perspective, is that when I’m designing architectures and designing solutions for a customer, I can’t tell them why I’m making decisions, but I can make certain decisions on choosing one architecture over another based on decisions that I know are going to be clear to them in six months’ time, but not now. I’ve got two customers at the moment who want to go down a particular route with the BI solution that we have for them, and I’m specifically holding off on that for a few months, because I’m aware of an announcement that’s going to be made at SQLBits in February that will change what they can do with that solution. And I know we will make their solutions easier, simpler, faster and better. And so it allows me to help make better decisions on behalf of my customers. And so from that perspective, it’s very valuable. From the consulting perspective, what it does mean is we also get access to play around with a lot of Microsoft tools. So obviously when you’re playing with an on-prem SQL Server instance there’s no cost involved. You’ve already got that environment there. When you’re playing with Azure, Azure is not free. All the services charge by the hour or by the instance you have.
Data Warehouse and Hosted Analysis Services, if you scale those up, then they can cost a decent amount of money. And so when we’re trying to put together conference talks or blog posts or really test the scalability and performance characteristics of these tools, that can be very expensive. So as an MVP Microsoft give us, not unlimited unfortunately, but a reasonable amount of Azure credits that we can use to play with this technology to really get to grips with it, which means we can then pass that on to other people. And I think probably the most valuable part of the MVP year is the MVP summit in Seattle. An opportunity where all MVPs from around the world and from every group, not just Data Platform, get invited over to Redmond and we spend a week with Microsoft, with the team leaders, the program managers, the architects, the developers and they talk us through the roadmaps. But also we get to have a lot of one-to-one conversations with the teams about technology, give them feedback, get an understanding from them as to why decisions have been made and the reasons for things. And that helps us then explain to the rest of the community how things can work. So it’s an incredibly valuable program, and I’m thrilled to be a part of it.

KN: Sounds very good. And you, as a speaker, how do you prepare yourself for a speech?

AW: That’s changed over the years. I always used to base a talk off blog posts, so I’d write a blog post and then evolve that and turn that into a talk. As my time blogging has decreased, I tend to just jump straight in and do a talk now. It’s still based on identifying things that I enjoy talking about and technologies that I find fascinating or interesting. That may be relevant to a particular project that I’m working on at the time, so at the moment we’re doing a lot of Analysis Services in Azure migrations. Migrations from Multidimensional Cubes to Tabular Cubes, Power BI to Tabular Cubes. So I’m doing a number of talks about what Analysis Services in Azure can do, how to manage it, scalability, etc. Because I’m investigating that for my customers anyway. I’m doing that as part of my job and everything that I learn about it is, obviously hopefully, useful for someone else. I can explore different options and the ways of managing it and then share that with other people. I did a talk this morning in the keynote here about designing a deep learning neural network to play games. Now, there’s no business reason why I’m teaching a neural network to play games, that’s just pure fun, but in my world as a BI consultant, it used to be all about getting reliable, clean, accurate data into a Data Warehouse and presenting it with self-service reporting. That’s still very much a core, fundamental part of BI, but once you’ve got that solid foundation of data, these days machine learning algorithms are so easy and accessible, whether that’s through SQL Server, R, or Python code, whether that’s through Databricks or Data Lake or RStudio, doesn’t matter, it’s accessible. And it becomes very easy to start doing customer churn predictions or time series modeling or future predictions or basket analysis. All these kinds of machine learning algorithms are very accessible now.

KN: Very accessible, but still, I feel that it’s very new for most T-SQL developers or BI developers.

AW: Absolutely, and that’s exactly why I did that demo in the keynote today. It was to show, with a hundred and eighty lines of Python code, you can create a deep learning neural network that will learn how to play a game with no guidance. And it works. And so ten years ago, five years ago you had to be a data science expert to write that kind of code. Now, with tools like Cognitive Services, with TensorFlow, tools like that, it’s actually much more accessible, so BI people like me, and DBAs, don’t need to go and do a PhD to understand data science. They can write code now. OK, to do a really complex model, I’m not belittling the value of a data science professional, but you can start out playing with this tech to understand this in a really easy way. So as my day job is getting more and more involved in data science, I’m teaching myself how different tools work, such as TensorFlow. So I wrote that game or started playing with that game in order to teach myself better. And I just really enjoyed that demo, so I thought it would be useful to share. To share with people that it isn’t as scary as it may seem, so a lot of my talks actually stem from that now. When I’m learning a new technology, I’ll find something that I think is valuable and I’ll just turn it into a talk and evolve it. And then every time I go and do a talk, I get feedback on it and I’ll extend it, enhance it, based on the feedback and it evolves over time.

KN: Can you share that code, so we can attach it to this post?

AW: There’s actually a lot of sample code already out there that does that. So actually I started by working through a YouTube tutorial with a guy, I can’t remember the name of the chap, but if you search for OpenAI Gym and Cart Pole on YouTube, you will find a four-part series of videos where this guy talks through a step-by-step process on all the Python code to use this and to write that game. And in the OpenAI Gym library there’s a number of games you can play and you can modify his code to do various things. So that’s actually how I started playing around with this code and then evolved the code, and developed it myself to do different things. So there is plenty of code and tutorials already there. It’s just I don’t think people are looking for it because they think it’s really complicated and really out of their reach, but I wanted to show that actually anybody can jump in and spend a couple of hours playing with it.

KN: That’s why I would like to attach your code, because you showed this small game based on neural networks?

AW: Deep learning neural networks. What I’ll do is I’ll give you a link to this YouTube series. That talks it through in a really clear way with pretty much the same result as I was getting. So I’ll give you a link to that and then give that guy some credit for training me on how to do it!
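
For readers who have not seen the cart-pole game mentioned above, here is a minimal sketch of what the environment looks like. It is not Alex’s code or the tutorial’s code, and it contains no neural network: it is a standard-library Python simulation of a pole balanced on a cart, with a random player and a simple hand-written player for comparison. The physical constants and the function names (`step`, `run_episode`, `random_policy`, `rotation_policy`) are assumptions chosen for illustration, based on the commonly used cart-pole benchmark values.

```python
import math
import random

# Constants assumed from the commonly used cart-pole benchmark (not from the talk).
GRAVITY = 9.8
CART_MASS = 1.0
POLE_MASS = 0.1
TOTAL_MASS = CART_MASS + POLE_MASS
POLE_HALF_LENGTH = 0.5
FORCE = 10.0                  # size of each push, in newtons
TIME_STEP = 0.02              # seconds per simulation tick
MAX_ANGLE = math.radians(12)  # the episode ends if the pole tilts further than this
MAX_POSITION = 2.4            # the episode ends if the cart leaves the track


def step(state, action):
    """Advance one tick. action 0 pushes the cart left, action 1 pushes it right."""
    x, x_dot, theta, theta_dot = state
    force = FORCE if action == 1 else -FORCE
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + POLE_MASS * POLE_HALF_LENGTH * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / TOTAL_MASS)
    )
    x_acc = temp - POLE_MASS * POLE_HALF_LENGTH * theta_acc * cos_t / TOTAL_MASS
    # Simple Euler integration of position, velocity, angle and angular velocity.
    new_state = (
        x + TIME_STEP * x_dot,
        x_dot + TIME_STEP * x_acc,
        theta + TIME_STEP * theta_dot,
        theta_dot + TIME_STEP * theta_acc,
    )
    failed = abs(new_state[0]) > MAX_POSITION or abs(new_state[2]) > MAX_ANGLE
    return new_state, failed


def run_episode(policy, rng, max_steps=200):
    """Play one game and return how many ticks the pole stayed up."""
    state = tuple(rng.uniform(-0.05, 0.05) for _ in range(4))
    for t in range(max_steps):
        state, failed = step(state, policy(state, rng))
        if failed:
            return t + 1
    return max_steps


def random_policy(state, rng):
    """Baseline player: push left or right at random."""
    return rng.randint(0, 1)


def rotation_policy(state, rng):
    """Hand-written player: push in the direction the pole is currently rotating."""
    return 1 if state[3] > 0 else 0


if __name__ == "__main__":
    rng = random.Random(0)
    print("random player survived:", run_episode(random_policy, rng), "ticks")
    print("hand-written player survived:", run_episode(rotation_policy, rng), "ticks")
```

A learning agent, such as the deep learning network described in the interview, would take the place of the policy function: it sees the same four numbers each tick and has to work out for itself which push keeps the pole upright.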

KN: Tell me about your work-life balance.

AW: If you ask me or ask my wife you may get different answers. It’s something that I always struggle with. I passionately enjoy my job and I run my own business. Anyone that runs their own business knows, there’s no such thing as a day off. You’re always on duty. I started out as a freelance consultant and you’re working seven days a week, building a new business. I’m quite fortunate now to have a team of six and I’ve got a team that can carry on doing the work without me in the business, although it’s still very difficult to detach myself from it. About two years ago, I had to make a life shift, where I was working seven days a week. I was working 12-14 hours a day in the office and I had a discussion with my wife Hollie, who is the biggest supporter and the biggest influence in my life. Every decision I make that’s been good in my life has basically come from Hollie. So I owe her a lot, she’s my biggest supporter, pusher and mentor, I suppose. And it was a case of I need to slow down a little bit at work and find a hobby outside of work to try and detach myself from being in the office all the time. So I took up flying. I wanted something that forced me to get away from work and to think about something else. My dad used to fly when he was in his twenties and I’ve got a couple of very good friends who are pilots so they encouraged me to take up flying. So I started learning. I got my pilot’s license after, I think, about six months of training and yeah, I’ve been flying ever since. And I absolutely adore it. It’s a total detachment from reality and you have to focus and concentrate so hard on everything that’s going on, that I find it incredibly relaxing because it forces me to stop thinking about work. You’re focusing on radio, you’re focusing on actually flying the plane, on navigation, on monitoring, on fuel management…

KN: There’s autopilot!

AW: Oh, that’s only for jet pilots. Yes, I’ve got autopilot on the plane, but I never use it. There’s no fun in that, is there? You actually want to fly the thing. So I very rarely turn it on. I prefer to fly myself. But I find it’s like juggling ten balls, there’s so much to think about that you can’t worry about work, you can’t worry about everything else that’s going on in life. It forces you to detach. But it’s also an incredibly social thing. I go flying with my friends, we’ll fly off on Sunday morning for a bacon sandwich somewhere the other side of the country. To drive down to Cornwall takes about three and a half to four hours. I flew there with my sister the other week for an ice cream. We flew down there, had an ice cream on the beach and flew back again.

KN: I’ve seen that you also fly to the customers?

AW: Actually, one decision-making factor in which customers we take on may be related to how close they are to an airfield. It’s not a serious consideration, but if there is an airfield nearby, then that’s nice to be able to fly to a customer. You can’t always manage it with weather, but it’s good fun.

KN: So you are a celebrity.

AW: I wouldn’t quite say that.

KN: Wait a minute: a local newspaper has written about you flying in a Spitfire. Let me quote: “Alex swaps digital cloud for the real thing – a Spitfire treat”

AW: Yes, that was a fun day. It was my birthday weekend and my wife Hollie, she’s a musician, so she plays clarinet for the Birmingham Philharmonic Orchestra, a very talented musician. And so because she’s out every Sunday rehearsing, I had this day free on the Sunday and I thought I’d go flying and I was looking at where to go on the day and I spoke to a couple of friends.

Alex Whittles takes to the air in the Spitfire. Source: Shropshire Live

My friend John was free, he’s a pilot as well, and so we thought, we’d go for a jolly for the day. We’re always looking around as to where to go and I’ve never been to Duxford and there’s obviously the air museum there. And John suggested that we go there, so I flew us down to Duxford. It took an hour and ten minutes, something like that, it would have taken three hours in the car. And as we were coming in to land, the air traffic control came on the radio and basically said “I’m sorry, we’re giving priority to a Spitfire coming in”. You always give priority to a Spitfire. If you’re flying a Spitfire, you own the skies, you can have the runway. And so the Spitfire came in in front of us and we had to do a last-minute diversion off onto the grass runway parallel to it. And so we landed parallel with this Spitfire, just behind the Spitfire. It was just a magical experience. So just flying close to the Spitfire, seeing it in the air, it was just something else. And so we landed and then parked up the plane and then wanted to go and have a look at the Spitfire and have a chat with the pilot, get a photograph. And it turns out that they sell flights in a Spitfire from Duxford, which I didn’t realise. And they had one flight that was free, it was available on that day. It was my birthday weekend so I thought “once in a lifetime opportunity”. I can’t really justify the cost of it, but how often do you get an opportunity to fly a Spitfire? And it turns out, my friend who went down with me, John, about six or seven years ago had flown that same Spitfire, which used to be owned by Red Dragon up in North Wales and he’d flown it up there. And so he was the little devil on my shoulder saying “do it, do it, do it” and so I signed up to do the flight. And also because I’ve got my pilot’s license, I was able to take control of it and we had a half-hour flight around Duxford, around Cambridgeshire, we did a victory roll in it and then he gave me the controls.
We were flying in close formation with a Harvard, with my friend John in the Harvard, so I’ve got the most fantastic photograph from the Harvard looking at me in the back of this Spitfire. And to this day it’s just my favorite photograph that’s ever been taken. And that experience of actually feeling the raw power in that Merlin engine and the sensitivity of the controls, compared to the PA-28 that I normally fly, which is like flying a Cadillac, very relaxed and forgiving. And with a Spitfire you so much as touch the controls and you’re flung sideways. The sensitivity and that power is just… you cannot imagine this until you get into the seat. What really struck me was whenever you hear a Spitfire, it’s an unmistakable sound of that Merlin engine. It’s instantly recognizable, that’s a phenomenal sound. Once you get into the Spitfire – completely different. It doesn’t sound anything like it, it’s really strange. Fantastic experience. The only slight issue was it was a little bit of a squeeze to get into the cockpit. I’m 6′6″, Spitfires weren’t really designed for people of my stature, so you can look at the photograph and see my head actually just touching the roof of the cockpit, squeezing into the glass, so it was a tight squeeze, but it was an incredible experience. I actually flew back into Duxford again about three weeks ago and had exactly the same experience of flying in formation with the Spitfire as we were going around the circuit coming in to land and again, had to divert off to the grass strip again next to the Spitfire, but it was lovely. It was such an incredible experience.

KN: So sounds like you found a perfect hobby.

AW: Absolutely, I love it. And it gets me away from work. And it really does help the work-life balance. They say a change is as good as a rest. Taking your mind away from something. Now whether that’s flying or for my wife it’s music, it’s going to see her doing concerts, anything like that that takes your mind away from whatever is stressing you out on the day. The day-to-day grind of normal, day-to-day life, we all have it, we all have stress. It’s finding something that detaches you from that and gives your brain a break to think about something else, and you get back more refreshed. And I find having that little bit of time away whenever I go and fly every couple of weeks, just gives me a renewed vigor for work when I’m going back again. So I’m learning how that work-life balance needs to work at the moment, and it’s an evolving process but I’m trying.

KN: Like all of us! The last question. Children?

AW: We don’t have children. I have eight delightful godchildren and two nieces and two nephews. And I’m very fortunate, I get to spend a lot of time with them and they’re all wonderful. The beauty of them is that when they play up, you can give them back to their parents.

KN: Alex, at the end of our conversation, tell us where we can find you.

AW: Where you can find me? I’m on Twitter as @PurpleFrogAlex. My website is www.purplefrogsystems.com. You can find me on my blog, it’s www.purplefrogsystems.com/blog/. Or if you go onto the Contact Us on the SQLBits website, those emails come straight to me. Or you can find me at most events around the UK or around Europe speaking or helping.

KN: Thank you very much for this conversation. Thank you very much indeed.

AW: It’s been a delight speaking to you, thank you very much.

 

Useful links

Alex’s Twitter: @PurpleFrogAlex
Alex’s websites: Purple Frog Systems (company) | BLOG
AI articles/videos: Deep learning neural networks | the OpenAI Gym library | TensorFlow
Article: Alex swaps digital cloud for the real thing – a Spitfire treat
Alex’s post about SCD & MERGE: Using T-SQL Merge to load Data Warehouse dimensions
Conferences: SQLBits, Data Relay (formerly SQL Relay)

About author

Kamil Nowinski
Blogger, speaker. Data Platform MVP, MCSE. Senior Data Engineer & data geek. Member of Data Community Poland, co-organizer of SQLDay, Happy husband & father.
