
Artificial Intelligence’s Next Big Step: Reinforcement Learning


Almost every machine learning breakthrough you hear about (and most of what’s currently called “artificial intelligence”) is supervised learning, where you start with a curated and labeled data set. But another technique, reinforcement learning, is just starting to make its way out of the research lab.

Reinforcement learning is where an agent learns by interacting with its environment. It isn’t told by a trainer what to do; instead, it learns by trial and error which actions earn the highest reward in a given situation, even when the reward isn’t obvious and immediate. It learns how to solve problems rather than being taught what solutions look like.

Reinforcement learning is how DeepMind created the AlphaGo system that beat a high-ranking Go player (and has recently been winning online Go matches anonymously). It’s how the University of California, Berkeley’s BRETT robot learns how to move its hands and arms to perform physical tasks like stacking blocks or screwing the lid onto a bottle in just three hours (or ten minutes, if it’s told where the objects it’s going to work with are and where they need to end up). Developers at a hackathon built a smart trash can called AutoTrash that used reinforcement learning to sort compostable and recyclable rubbish into the right compartments.

Reinforcement learning is also the reason Microsoft just bought Maluuba, which it plans to use to aid natural language understanding for search and chatbots, as a stepping stone to general intelligence.

Commercial deployments are far rarer, though. In 2016, Google started using DeepMind’s reinforcement learning to save power in some of its data centers by learning how to optimize around 120 different settings like how the fans and cooling systems run, adding up to a 15 percent improvement in power usage efficiency.

And without anyone really noticing, back in January 2016 Microsoft started using a very specific subset of reinforcement learning called contextual bandits to pick the personalized headlines for MSN.com; something multiple machine learning systems had failed to improve.

The contextual bandit system increased clickthrough by 25 percent — and a few months later, Microsoft turned it into an open source Multiworld Testing Decision Service, built on the Vowpal Wabbit machine learning system, which you can run on Azure.

Microsoft’s John Langford discusses multiworld testing at QCon NYC last June.

“We have a deployable system, which I think is the first anywhere in the world,” claims John Langford, the Microsoft researcher who started work on Vowpal Wabbit when he was at Yahoo.

Multiworld testing runs multiple context-sensitive experiments at the same time, and it lets you answer much more detailed questions than standard A/B testing. Contextual bandits are a mathematical representation of a slot machine with multiple arms; before choosing which arm to pull, the agent sees a feature vector describing the current context (from the multiworld testing), as well as the rewards for the arms it has played in the past.
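To make the mechanics concrete, here is a minimal TypeScript sketch of a contextual bandit loop using epsilon-greedy exploration and a per-arm linear reward model. The feature vectors, learning rate and exploration rate are illustrative assumptions; the Decision Service itself uses more sophisticated exploration algorithms built on Vowpal Wabbit rather than this toy update rule.

```typescript
// A toy contextual bandit with epsilon-greedy exploration: an illustrative sketch,
// not the exploration algorithm the Decision Service actually uses. Each arm keeps
// a linear model of expected reward given the context, and only the reward for the
// arm that was actually played is ever observed.
type Context = number[];

class ToyContextualBandit {
  private weights: number[][]; // one weight vector per arm

  constructor(
    numArms: number,
    numFeatures: number,
    private epsilon = 0.1, // how often to explore a random arm
    private learningRate = 0.05
  ) {
    this.weights = Array.from({ length: numArms }, () => new Array(numFeatures).fill(0));
  }

  private predict(arm: number, context: Context): number {
    return this.weights[arm].reduce((sum, w, i) => sum + w * context[i], 0);
  }

  // The agent sees the context before choosing an arm.
  choose(context: Context): number {
    if (Math.random() < this.epsilon) {
      return Math.floor(Math.random() * this.weights.length); // explore
    }
    const scores = this.weights.map((_, arm) => this.predict(arm, context));
    return scores.indexOf(Math.max(...scores)); // exploit the best-looking arm
  }

  // Reward is only observed for the chosen arm, e.g. 1 for a click, 0 otherwise.
  update(arm: number, context: Context, reward: number): void {
    const error = reward - this.predict(arm, context);
    this.weights[arm] = this.weights[arm].map((w, i) => w + this.learningRate * error * context[i]);
  }
}
```

In the MSN.com scenario, each arm would be a candidate headline, the context would describe the user and the page, and the reward would be a click.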

Contextual bandits are one of two ‘islands of tractability’ in the reinforcement learning space “which is clearly still more of a research endeavor than supervised learning,” warned Langford. “There are a lot of problems that are readily posed as reinforcement learning problems for which we have no effective solution.”

“They work in situations where the reward is immediately clear and you get feedback on your actions,” he explained; “Where you have contextual control over small numbers of actions and where you get feedback about that action. We want to try to tame these techniques, to normalize them and make them easy to use; that’s what we’re trying to do with the decision service.”

Sometimes the feedback isn’t immediate, so you might need to use reward shaping, which “lets you take a long-term goal and decompose the reward for that into a bunch of short-term rewards — the sum of which, if you get it right, gives you the long-term goal. This is a key problem you need to resolve when trying to solve reinforcement learning problems.”
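Langford doesn’t name a particular technique, but one standard way to do that decomposition is potential-based shaping, where each step’s reward is adjusted by the change in a “progress” function so the short-term rewards telescope into the long-term goal. The sketch below assumes a made-up state with a distance-to-goal field:

```typescript
// A sketch of potential-based reward shaping, one common way to do the decomposition
// described above (the quote does not name a specific method). The per-step shaped
// reward adds the change in a "progress" potential, so the short-term rewards
// telescope: their sum equals the long-term reward plus the total progress made.
type State = { distanceToGoal: number }; // made-up state for illustration

const potential = (s: State): number => -s.distanceToGoal; // higher when closer to the goal

function shapedReward(envReward: number, current: State, next: State): number {
  return envReward + potential(next) - potential(current);
}

// A three-step trajectory with a sparse environment reward only at the very end.
const trajectory: State[] = [
  { distanceToGoal: 3 },
  { distanceToGoal: 2 },
  { distanceToGoal: 1 },
  { distanceToGoal: 0 },
];

let total = 0;
for (let i = 0; i < trajectory.length - 1; i++) {
  const envReward = i === trajectory.length - 2 ? 1 : 0; // goal reward on the last step
  total += shapedReward(envReward, trajectory[i], trajectory[i + 1]);
}
console.log(total); // 4 = 1 (goal reward) + 3 (total progress): short-term rewards sum to the long-term goal
```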

Sticking to those situations is how the team was able to create a reinforcement learning solution that works in the real world rather than just a research problem. “It’s very much a limitation of scope that makes it tractable,” Langford points out. “There’s a particular subset, contextual bandits, where we know that things are tractable and the decision service tackles that subset.”

“The important thing to note is there is no long-term decision-making aspect,” further explained Alekh Agarwal of Microsoft Research, who also worked on the decision service. “It’s a sequential process where you observe some context and take an action and immediately get a reward; there’s no long-term reward to worry about. There’s still a limitation; if you take an action you only see the reward for that action, not what would have happened if you took another action. You’re trying to optimize your choice of actions given the contextual information you have.” That’s another way reinforcement learning differs from supervised learning, he notes.

The problem isn’t just long-term rewards, but “the credit assignment across multiple actions,” added Langford. “If you take 30 actions and any of those can affect the rewards you observe, that’s a different situation. It’s easy to come up with problems where all the existing algorithms will fail badly.”

The Right Problems

The Multiworld Decision Service is live and costs about 20 cents an hour to run on Azure.

There are problems where reinforcement learning is better than the better known supervised learning, though. “Right now there are a lot of problems people are trying to solve that don’t fit the supervised learning paradigm but they’re used to supervised learning so they try to use it,” Langford explained. “The canonical one is trying to predict clicks and optimize which ad you place.” Contextual bandits are ideal for that, and for making other personalized content recommendations.

Another applicable area is personalized user interfaces that adapt to your behavior. Imagine that you wanted to use an EEG to control a computer. “Every time you put the EEG on your head, it’s in a slightly different position and the system needs to learn to adjust very quickly to where it’s been placed on your head,” suggested Langford. “The person using the interface is going to notice when things go wrong and they can issue a correction; that was right, that was wrong. There would be a natural co-learning process.”

Personalized healthcare is a trickier problem, especially given privacy issues and fragmented medical records, but at least theoretically, contextual bandits might help. “Not all healthcare is the sort where a single action leads to a clear outcome, but a portion is,” he noted. “Instead of the drug trials we have today, imagine trials that are ten or twenty times larger that learn a policy for deploying treatments personalized to individual subjects, rather than the one-size-fits-all policy we have right now.”

Resource allocation — for humans or computers — is applicable well beyond Google’s data center management trials and it’s also a good fit for contextual bandits, says Agarwal. “When you send a request to a website, which server should handle it? Operating systems solve many resource allocation problems. In many of these cases you have to do some work to define the right functions, but some of them end up being reasonable fits for bandits.”

Getting the rewards right is key; “it tends to be where the art is in trying to solve reinforcement learning problems,” says Langford. “Sometimes it’s dead obvious, like clicks. Sometimes it’s more subtle. But figuring out how to frame the problem is the silver bullet for solving the problem.”

Show Me

If that’s just too difficult, researchers turn to the second type of reinforcement learning that we can currently do well: imitation learning, by demonstrating a technique. “It may be easier for a human to supply the information about the right thing to do than to come up with a good reward function that matches the problem.”

“You see this a lot in robotics where you demonstrate to a robot what you want it to do and it can learn to do something like the same thing, sometimes even better than a human can do,” he noted. “You need to make a long sequence of decisions to succeed and it’s hard to break down the value of incremental decisions. Robots work from sensory feedback; they have cameras and sensors for where the actuators are and they translate this feedback into short-term rewards. The beauty of this is that the demos keep you out of local minima and maxima.”

Self-driving car systems work in the same way, noted Agarwal, and he points out that to be effective, imitation learning needs high-quality demos. “If you’re getting very expert demonstrations with optimal sequences of actions most of the time, you can learn to imitate them well and generalize them to unseen situations and not get stuck.”

Unlike contextual bandits, there isn’t just one technique for imitation learning. There isn’t a standardized platform like the Multiworld Decision Service that you can use on your own problems. But we’re starting to get platforms to help researchers experiment.

Play a Game?

Games are a common way to train reinforcement learning systems because they have built-in rewards. Atari 2600 games have been a popular choice, but often they’re fairly simplistic environments. At the end of 2016, both Google and OpenAI announced that they were opening up their reinforcement learning training systems to researchers, giving far more people access to the kind of complex simulated environments for training AI agents previously reserved for companies with the budget to build them.

Google’s DeepMind Lab — known internally as Labyrinth — looks like a simple 3-D game, with a floating orb representing the AI agent. The world includes 14 levels and four kinds of learning tasks like navigating a maze (static or generated on the fly), playing laser tag and collecting fruit, but researchers can get the code for this virtual environment from GitHub, create their own levels (using a game editor or programmatically in C and Python) and experiment with different reward schemes and gameplay logic.

The AI agent in DeepMind Lab is an orb that can view and navigate the 3D world (Photo: Google).

OpenAI’s Universe is also an experimentation platform for working on AI agents that try to learn to use computers the way humans do: by looking at the pixels on screen and operating virtual controls. As with Lab, the aim of Universe is to develop an AI agent that can not only learn to deal with one situation but also apply the learning techniques it has developed to unfamiliar environments, as a stepping stone to creating AI that goes beyond a single, narrow domain. OpenAI’s approach is to give researchers access to a lot of environments that were created for humans, rather than specially crafted for AI agents to learn in. Not only does that turn games and apps we already have into a training ground; it also means AI agents can watch people using the software to kick-start their learning, and we can compare the AI to human performance rather than just to other agents.

Universe lets you use any program with OpenAI’s Gym toolkit for building reinforcement learning agents in frameworks like TensorFlow and Theano. Gym already included simulated robots, Go and a range of classic Atari games, and Universe extends that to over a thousand environments, including Flash games, 80 common browser tasks like typing in a password or booking a flight, and games like Grand Theft Auto V.

Universe packages them up as Docker images, launches them on a VNC remote desktop and controls them through Python scripts — although not all of them support reinforcement learning yet. OCR runs in the Python program that controls the Docker container to scrape the game scores to use as rewards; of the 1,000 Flash games, 100 have reward functions and OpenAI has plans to use human players to demonstrate more of the games to AI agents, to make it easier to analyze what the rewards should be. In the future, Universe AI agents will get access to games like Portal and Wing Commander III, as well as Wolfram Mathematica, and maybe Android and Unity games as well.

They’re also going to be able to run inside Project Malmo, Microsoft’s reinforcement learning experimentation platform, which runs on top of Minecraft (Microsoft started work on it in 2014 and open sourced it in mid-2016).

“Some AI techniques that were purely on the research side for decades are starting to move closer and closer to real world applications,” says Katja Hofmann, from Microsoft’s research lab in Cambridge. “That’s very exciting. To really push those techniques forward, we need to flexibly, rapidly be able to experiment with techniques. That need for pushing forward experimentation was the motivation for Project Malmo. Now there are more and more of these platforms, which is exciting — and important for both pushing research forward and opening that research up to the broader community of developers and enthusiasts who can join in and start productizing.”

Currently, Universe and Project Malmo use slightly different APIs to integrate bots and agents in games and to run experiments. The first step will be making it easier to train an agent on one platform and then test it on the other. “There’s a lot to be gained by standardizing some of those APIs to make it as easy as possible for the community to switch back and forth.”

In the long run, that will let researchers create portable agent architectures. “We’re working with variants of deep reinforcement learning agents that can not only learn 2-D Atari games but also plug into agents that navigate the 3-D Minecraft world where they can look around and see where to go. Having the same kind of architecture for both will translate to effective learning in both those scenarios, so we can rapidly make progress, though experimentation, on aspects that focus on interactive learning.”

The two platforms have different research agendas. Project Malmo is about what Hofmann calls flexible learning.

“The idea is to develop AI that doesn’t just address a single task but that flexibly learns and builds up common-sense knowledge and uses that to tackle more and more complex problems. We think Minecraft is fantastic for this because it creates a sandbox. You have an entire world that’s infinite and procedurally generated. You put the agent in the environment and it experiences it from a first-person perspective. Today, the agent can learn basic survival — how to navigate and avoid lava. As the technology matures, it can build up more complex skills like construction, learning how to build up complex items. Agents will be able to reuse their knowledge and basic understanding of the Minecraft world when they learn new tasks. That’s similar to how people learn about the world in one particular way and adapt that to the task at hand.”

Ultimately, she hopes that work will lead to collaborative AI, where agents learn to collaborate with human users. “If you want to achieve a goal, we’ll need an agent that can understand the goal and reason about the steps it needs to take in order to help you achieve that goal. That’s one of our key motivations.”

The OpenAI project has a rather different goal; they’re hoping to create a single AI agent with a generic problem-solving strategy, a first step towards general AI.

Experimentation Is the Way Forward

Like so much of AI, reinforcement learning isn’t new; the first textbook covering it dates to 1998 (and the second edition will finally come out this year). What’s different now is partly that we have experience with some problems that are well understood, particularly in the two areas of contextual bandits and imitation learning. But we also need these new experimentation platforms like Universe and Project Malmo and DeepMind Lab to give more researchers access, and to compare solutions in the same environment to benchmark progress.

Agarwal compares the availability of experimentation platforms for reinforcement learning to the impact large labeled data sets like ImageNet had on supervised learning. “The way we make a lot of progress in supervised learning was that we started accumulating large data sets and repositories and once we had those, we could try algorithms out on them reliably and iterate those algorithms.” A static data set isn’t useful for evaluating more general reinforcement learning; “two different agents will take two different trajectories through an environment.”

Instead, researchers need a large, diverse set of environments that’s also standardized so everyone in the field works against them. “Flexible, diverse platforms can serve the same function as a repository for reinforcement learning tasks where we can evaluate and iterate on ideas coming out of research much faster than was possible in the past, when we had to restrict the algorithms to simple evaluation problems because more complex ones weren’t available. Now we can take ideas to the platforms and see whether or not they do a good job,” Agarwal said.

Reinforcement learning will often be only one of the machine learning strategies in a solution. Even AlphaGo was initially trained to mimic human play using deep learning and a database of around 30 million moves from 160,000 games played by human Go masters. It was only once it reached a certain level of skill that it began playing against other instances of AlphaGo and using reinforcement learning to improve.

“It’s important in the long run to understand how an AI agent could be able to learn about those goals and about the peculiarities and abilities of the person it’s working with” — Katja Hofmann

That pattern might actually be key to making reinforcement learning ready for wider use, Hofmann suggested — either by using deep learning to prepare the actions and rewards for a reinforcement learning system, or by using reinforcement learning to reduce the work it takes to apply supervised learning to a domain.

“At the moment, if you put a reinforcement learning agent in a 3-D environment, it would reason at the granularity of individual actions, taking one step forward, not at a higher level using the concept of walking to the next door. There are questions about abstraction that still need to be addressed before we can put the next pieces of reinforcement learning into applications, like understanding how to set goals in situations where a clean scoring function might not be available,” Hofmann explained.

The experimentation platforms help with that research, but so do the recent advances in deep learning. “For decades, the reinforcement learning problem of learning an appropriate representation of a domain couldn’t be tackled in a systematic way, because for every new app you could envision, you would need domain experts to find a representation [for that domain]. That was very resource intensive and didn’t scale to the breadth of applications we would like to unlock,” Hofmann explained, adding that “We’ve had major advances in learning representations and automatically extracting the features that describe a domain, and that allows us to scale algorithms to new domains.”

There is still plenty of research to be done, like how to get agents to explore their environment systematically, giving them the equivalent of curiosity or the intrinsic motivation to explore. And there’s also the problem that few developers are familiar with reinforcement learning at this point. It’s likely that systems like the Multiworld Decision Service will help bring reinforcement learning to a broader developer audience, Agarwal suggested.

One of the differences between supervised learning and reinforcement learning is that in the supervised learning world it’s usually a machine learning person providing the algorithm and the user bringing the data. “In reinforcement learning, it’s more that you bring your domain to the table and the agent by acting in the world is creating its own data. That creates a lot of complexity; the app developer has to build a lot more pieces of the solution if they want an algorithm to work in the right way and there are many things that can go wrong. It’s often better to design a whole system,” Agarwal suggested.

Langford is also optimistic. “Over the next say five years, I expect that education will happen and that we will see many more successful applications and see these kinds of systems become much more standardized in the world.”

And Hofmann has some big ambitions of her own. “You can envision an AI that’s able to learn; it would have some general knowledge about the environment, about what kinds of tasks people want to achieve, and it would also be able to learn on the fly so it can personalize its help and support towards the goals of the person or the player.”

“In the real world, every person has different knowledge and abilities and desires,” she explained. “It’s important in the long run to understand how an AI agent could be able to learn about those goals and about the peculiarities and abilities of the person it’s working with and be able to personalize its assistance and actions to help that particular person achieve their goals.”

Feature image: Laser tag levels in DeepMind Lab test agents on fine control, strategy, planning and dealing with a complex visual environment. Image from Google.



JavaScript Will Finally Get Proper Asynchronous Programming


The proposal to include async functions in ECMAScript has reached stage four; that means it’s on track to be in the 2017 release of the standard. But what does that mean for JavaScript developers?

There’s a lot of interest in async, the capability JavaScript needs to easily manage multiple operations that complete at different times without blocking.

“Because JavaScript is single-threaded, that means if you have any long-running work it has to happen asynchronously for your app to remain responsive or it would just block and your browser would freeze,” said Anders Hejlsberg, the lead architect of C# and now also a core developer for Microsoft’s TypeScript transpiler for JavaScript. “So the JavaScript runtime libraries and all the frameworks are designed such that they only have asynchronous ways of doing things. If you want to do an expensive operation like an XML HTTP request, you don’t get to block and wait for the result; you get to supply a callback that calls you back later with the result.”

“There’s a huge amount of excitement out there; people are looking forward to when they can use async functions without transpilation,” said Brian Terlson, from Microsoft’s Edge team, who is the editor of the ECMAScript standard as well as “champion” for the async proposal on the TC39 committee that standardizes ECMAScript. When he tweeted that the async proposal had reached stage four, it got more retweets than anything else he’s tweeted.

“Async programming models allow developers to ask all their questions at once. Developers then react to the answers as those answers are provided. The application is constantly adjusting to the information as it comes in. The user experiences a dynamic application that updates itself instead of being forced to wait an unbounded time for a perfect completed view,” said Naveed Ihsanullah of Mozilla’s platform engineering team. Part of the excitement is that async functions will also make that code more understandable.

“Asynchronous programming is very important for developing the best user experiences. Information is in many places and modern applications seek to seamlessly integrate all those disparate sources into one cohesive view. The instantly loaded and completed web page is all illusion, however. Behind the scenes, numerous requests for information are made on the user’s behalf. Some of these are answered quickly and some may take longer. Some may go unanswered altogether,” he said.

The whole web platform is moving in this direction, pointed out Terlson. Async going into ECMAScript 2017 is “a reflection of the fact that more and more things in the platform are asynchronous, so your code ends up having to deal with more asynchrony. Talking to a web worker is an asynchronous kind of thing, as is any kind of networking. Storage APIs are asynchronous. Service workers are doing a bunch of network stuff, so they’re asynchronous. The new Streams API has a lot of asynchronous pieces in it. As new APIs are added, we’re just discovering more and more sources of asynchrony as the platform grows in capability, so it permeates your code.”

The growth of APIs that add asynchrony means JavaScript needs better ways of handling that in code than callbacks. “If you have just one source of asynchrony a callback is OK, but if you’ve got a lot of them it sucks, and it’s also painful for performance reasons, with lots of functions created and thrown away.” Essentially, notes Terlson, iterating over the HTML Document Object Model (DOM) again and again isn’t efficient.

Making Async Bearable

He views async as a ‘vast improvement’ over callbacks because “there’s no pyramid-of-doom nesting callbacks” and Ihsanullah agreed.

“While async programming has many benefits, writing applications in this style is often complex and tedious. The JavaScript language has had low-level async facilities, such as XMLHttpRequest, for years. These lower level callback-based constructs were very difficult to work with, difficult to maintain and difficult to debug. They could degenerate the source code to Callback Hell as multiple nested requests were made. Async has the potential for greatly decreasing the barrier to writing high-quality, maintainable asynchronous code. These developer benefits translate directly to more responsive applications for users as more async is used.”
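The difference is easy to see side by side. Below is a small TypeScript sketch contrasting nested callbacks with the same flow written using async/await; fetchUser and fetchOrders are hypothetical stand-ins for any asynchronous API, with made-up data and timings.

```typescript
// Two hypothetical callback-based APIs, simulated with setTimeout.
function fetchUser(id: number, cb: (err: Error | null, user?: { id: number; name: string }) => void): void {
  setTimeout(() => cb(null, { id, name: "Ada" }), 10);
}
function fetchOrders(userId: number, cb: (err: Error | null, orders?: string[]) => void): void {
  setTimeout(() => cb(null, [`order-${userId}-1`]), 10);
}

// Callback style: each step nests inside the previous one, and every level repeats
// its own error handling.
fetchUser(1, (err, user) => {
  if (err || !user) return console.error(err);
  fetchOrders(user.id, (err2, orders) => {
    if (err2 || !orders) return console.error(err2);
    console.log(user.name, orders);
  });
});

// Promise-returning versions of the same operations...
const fetchUserAsync = (id: number) => Promise.resolve({ id, name: "Ada" });
const fetchOrdersAsync = (userId: number) => Promise.resolve([`order-${userId}-1`]);

// ...let the same flow read top to bottom, with no nesting.
async function showOrders(): Promise<void> {
  const user = await fetchUserAsync(1);
  const orders = await fetchOrdersAsync(user.id);
  console.log(user.name, orders);
}
showOrders();
```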

In many ways, this is JavaScript catching up with other languages, like C#, which pioneered the async/await pattern, and the way asynchrony will work in JavaScript is very similar to how it’s handled in C#, Hejlsberg said. That makes code easier to read and to think about.

“JavaScript is a single-threaded execution environment. If you want anything to happen as a result of asynchrony, you’ve got to do it through callbacks or someone has to call you, because there’s only one thread of execution. If someone else has it, they’ve got to give it up and let you run. So from day one, JavaScript always had callbacks, like setTimeout or DOM events; all that happens by someone calling you back.”

The problem is how complex the code structure becomes with a lot of callbacks, and how hard that makes it to work with, Hejlsberg said. “The logic often becomes more complex; what if you have to have conditional branching, or you have to have the equivalent of a for loop but with async calls in the middle of the loop? You can try to do that mapping yourself, where you have to lift your state into a shared object or shared variables and maintain that, but you basically have to write a state machine yourself. State machines are something computers are very good at reasoning about and humans are horrible at reasoning about!”

Async takes care of that, he explained. “It turns out that you can mechanically transform code that’s written in the regular sequential style into asynchronous code using CPS, Continuation Passing Style code rewrites. You can rewrite any program that uses synchronous function calls with returns and turn them into functions that take callbacks — and that’s what powers async. You get to write your code as if it is synchronous and then the compiler rewrites it into asynchronous callback-based code for you and turns your code into a state machine.”

That’s the best way to think about the new async/await feature, he suggests. “The places where you use the await operator to await an asynchronous piece of work, the compiler automatically makes a callback out of the rest of your code.”

“The big benefit is you get to write your code the way you always have. If you need an if statement, you write an if statement; if you need a for loop, you write a for loop, and inside those you can say await and then have control return and then come back whenever the async work completes. People are very excited about that, because it makes your code look a lot cleaner and it’s a lot easier to reason about your code.”
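A short TypeScript sketch shows what that looks like in practice: an ordinary for loop and ordinary branching, with an await in the middle of the loop. The fetchPage function is a hypothetical promise-returning API used only for illustration.

```typescript
// A sketch of await inside ordinary control flow; fetchPage is a hypothetical
// promise-returning API that runs out of data after three pages.
const fetchPage = (page: number): Promise<string[]> =>
  Promise.resolve(page < 3 ? [`item-${page}`] : []);

async function fetchAllPages(): Promise<string[]> {
  const results: string[] = [];
  for (let page = 0; ; page++) {         // an ordinary for loop...
    const items = await fetchPage(page); // ...with an await in the middle of it
    if (items.length === 0) {            // ordinary conditional branching
      break;
    }
    results.push(...items);
  }
  return results;
}

fetchAllPages().then((items) => console.log(items)); // ["item-0", "item-1", "item-2"]
```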

As Terlson notes, “For the most part you don’t have to worry about the fact that you’re calling an asynchronous API; you just await it and go about your day. If the promise is rejected you get an exception thrown by async, so you can write asynchronous code that looks like synchronous code and handle errors with normal synchronous imperative code.”
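Here is a minimal sketch of that error-handling behavior, assuming a hypothetical failingCall API whose promise is rejected:

```typescript
// A rejected promise surfaces as an exception at the await, so ordinary try/catch works.
const failingCall = (): Promise<string> => Promise.reject(new Error("network down"));

async function loadData(): Promise<string> {
  try {
    return await failingCall();
  } catch (err) {
    // Synchronous-looking error handling for an asynchronous failure.
    console.error("request failed:", (err as Error).message);
    return "fallback value";
  }
}

loadData().then((value) => console.log(value)); // logs the error, then "fallback value"
```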

You still have refactoring to do. “When you make a function an async function, code that calls it gets a promise instead of a value, so you do need to change the calling code to async as well, and await the result.” But that’s far easier to do with async because the code itself is less complex. “You can even await non-promises. There are some APIs that return promises but sometimes if they know a value synchronously they just return it, so if you await the API result the right thing will happen.”
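A small sketch of that ripple effect, using hypothetical getConfig, loadSettings and startApp functions: once loadSettings is async, its caller receives a promise and typically becomes async too, and awaiting a plain value still does the right thing.

```typescript
// getConfig returns a plain value, the way some APIs do when they already know the answer.
function getConfig(): { retries: number } {
  return { retries: 3 };
}

async function loadSettings(): Promise<number> {
  const config = await getConfig(); // awaiting a non-promise simply yields the value
  return config.retries;
}

async function startApp(): Promise<void> {
  // loadSettings is now async, so its caller gets a Promise<number> and awaits it too.
  const retries = await loadSettings();
  console.log(`will retry ${retries} times`);
}

startApp();
```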

Being able to write code using these familiar, synchronous patterns while getting all the benefits of asynchrony “greatly simplifies writing code for dynamic and responsive web applications” said Ihsanullah. He suggests thinking of it as a “syntactic wrapper over JavaScript promises and generators” and points out that “understanding these features will greatly facilitate a developer’s understanding of async” and “experience with promises is probably mandatory.”

Understanding that async and its related await keyword are based on generators and promises will help you deal with more complex asynchronous code, he said. “Await currently only allows waiting on one thing at a time. A developer recognizing that these async functions are promises, however, could then use await Promise.all(…) to wait on several actions.”
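For example, here is a sketch of that pattern, assuming two hypothetical promise-returning calls: because async functions return promises, Promise.all can start both operations and await their combined result.

```typescript
// Two hypothetical async calls with made-up results.
const fetchPrice = async (_sku: string): Promise<number> => 42;
const fetchStock = async (_sku: string): Promise<number> => 7;

async function getProductInfo(sku: string): Promise<void> {
  // Both calls start before either result is awaited; Promise.all gathers them.
  const [price, stock] = await Promise.all([fetchPrice(sku), fetchStock(sku)]);
  console.log(`price: ${price}, in stock: ${stock}`);
}

getProductInfo("ABC-123");
```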

Browsers Getting Ready for Async

At one point, async seemed a slightly controversial proposal. “There was some concern around whether blessing promises as the async pattern is the right choice or something more like tasks that could swap out other things under the covers,” Terlson explained. But at that stage there hadn’t been many implementations beyond Microsoft’s Edge browser (starting as an experimental feature in Edge 13.10547 back in September 2015 and moving to an unprefixed version in the Windows Insider preview build 14986).

“Then Google V8 started implementing it and as we got more implementation experience, and people were convinced there were not problems for performance and so on, that helped.”

Plus, explained Ihsanullah, JavaScript needed those building blocks of async: promises and generators. “While promises, by virtue of being implementable in JavaScript, have been usable in browsers for a few years, generators are a more recent addition to the language.” And getting the syntax to reflect the way developers use functions is important. “It started with arrows. Then generators. And now async functions. To the standards committee and to the community, ergonomics of the language matters.”

Now async functions are enabled by default in Chrome, since Chrome 55 (Google’s Jake Archibald called them “quite frankly marvelous”) and they’ve been in the Firefox nightly releases since November 2016 (the plan is to support them in Firefox 52). Opera 42 and later support async, and it’s under development in Safari.

Transpilers

And with transpilers like Babel and TypeScript, you can even write async code and have it run in older browsers, as well as being confident it will work in the latest browsers as they add support.

It’s a lot more work to support async without generators, which is why it used to only be possible to transpile async code to ECMAScript 2015 in TypeScript, and Babel has different techniques depending on which version of ECMAScript you want to target. But now TypeScript 2.1 lets you go all the way back to ECMAScript 3, said Hejlsberg.

“Once you rewrite your code into a state machine, if you have generators then the transformation is relatively simple — rewriting an async function with awaits in it into a generator is almost trivial. But if you don’t have generators it is much more complex — because now you have to wrap a state machine around your code and effectively every place you see await, the function has to return and then when control comes back it has to jump back there and continue executing. And since there are no gotos in JavaScript, that’s complicated. So you have to write a while loop with a switch statement with a bunch of machine-invented states that you then maintain. The rewrite that happens to your code is complex and getting that rewrite correct is not simple.

“With TypeScript 2.1 we switched to a new emitter; this is the backend of the compiler. It’s a tree writer that rewrites your syntax trees to make these new state machines and the other fancy stuff you have to make, so we now natively support rewriting async/await to ECMAScript 3. And it’s not just doing the simple cases where you can only use await at the top level, not in the middle of an initializer or for a property of an object literal — no, it’s an operator like any other, so just like you can say plus, you can say await.”
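To see why generators make the rewrite easier, here is an illustrative sketch (not the actual TypeScript emitter output) of the generator-based transformation: the async function becomes a generator that yields promises, and a small driver resumes it each time a promise settles. The step1 and step2 functions are made-up async operations.

```typescript
const step1 = (): Promise<number> => Promise.resolve(1);
const step2 = (n: number): Promise<number> => Promise.resolve(n + 1);

// What the developer writes:
async function original(): Promise<number> {
  const a = await step1();
  const b = await step2(a);
  return b;
}

// Roughly what a transpiler targeting generators produces:
function rewritten(): Promise<number> {
  function* body(): Generator<Promise<number>, number, number> {
    const a = yield step1();
    const b = yield step2(a);
    return b;
  }
  const gen = body();
  return new Promise((resolve, reject) => {
    function resume(value?: number): void {
      const result = value === undefined ? gen.next() : gen.next(value);
      if (result.done) {
        resolve(result.value); // the generator returned: the async function is finished
      } else {
        result.value.then(resume, reject); // wait for the yielded promise, then continue
      }
    }
    resume();
  });
}

rewritten().then((v) => console.log(v)); // 2, the same result as original()
```

Without generators, the same bookkeeping has to be hand-built as the while-loop-and-switch state machine Hejlsberg describes.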

Browsers and transpilers aren’t the only place async needs to be supported for it to become mainstream of course; frameworks and libraries also need to support it. “If you want to write async style code but you have a bunch of frameworks that weren’t written in that style, those frameworks are still going to do callbacks,” he points out. “Async works fantastically if you have a promise based library that you’re coding against. Often, though, libraries are not promise-based and then you have to promisify them or find a promisified version of that same functionality. That will be the challenge, because that’s the glue that connects you from the callback to the async world.”
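Promisifying is usually a thin wrapper. The sketch below assumes a hypothetical readSetting function that uses the Node-style (error, result) callback convention; in Node itself, util.promisify does the same job automatically for APIs that follow that convention.

```typescript
// A hypothetical callback-based library call, simulated with setTimeout.
function readSetting(key: string, cb: (err: Error | null, value?: string) => void): void {
  setTimeout(() => cb(null, `value-of-${key}`), 10);
}

// Wrap the callback API in a promise once...
function readSettingAsync(key: string): Promise<string> {
  return new Promise((resolve, reject) => {
    readSetting(key, (err, value) => (err || value === undefined ? reject(err) : resolve(value)));
  });
}

// ...and every caller can consume it with await from then on.
async function printTheme(): Promise<void> {
  console.log(await readSettingAsync("theme")); // "value-of-theme"
}

printTheme();
```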

Expect that to take time to happen. “As with anything, it’s not happening overnight but there’s going to be an increasingly gradual shift and modern frameworks getting written now will use promises for all their async — and that means it will be a lot easier to consume them with async code. It’s going to be this wave that slowly washes over as opposed to something that happens overnight.”

Announcing that async will be in ECMAScript 2017 will help with this. “Library authors will have to transpile async for some time,” Terlson predicts, “but they can now be confident it’s a future direction for JavaScript so they can use it and not be concerned it’s going to be broken by future language changes.”

And of course, writing asynchronous JavaScript will be new to many developers. “If you’re not using a transpiler today, you haven’t used async functions,” Terlson points out. Anyone already using async is an early adopter, but as ECMAScript 2017 moves towards ratification, now is the time to start looking at how it can improve your code.

Feature image via Pixabay.


With Azure Container Service, Microsoft Works to Make Container Management Boring


Earlier this week, Microsoft made the Kubernetes container orchestration engine generally available on Azure Container Service (ACS), alongside the other predominant orchestration engines, Docker Swarm and Mesosphere’s Data Center Operating System (DC/OS). The move is one more step in building out the service, Kubernetes co-founder Brendan Burns told The New Stack.

Burns moved from Google to Microsoft seven months ago to run ACS with the vision of turning it into “a really managed service” that can deliver not just tools for working with containers, but work as a whole Containers-as-a-Service (CaaS) platform.

Focusing on Apps and PaaS

As the technology matures, the emphasis shifts from how you use containers to what you use them for, he pointed out. “There’s a lot of talk in the Kubernetes community, and the container community in general, about how containers and orchestration need to become boring. It’s been a very hot and popular topic, but in some sense, it’s just a piece of infrastructure. It’s the apps you build on top that are really exciting and interesting.”

Getting that infrastructure in place, so you can see the benefits of containers over familiar-but-broken processes, needs to be fast. “If it takes six months or a year to get the benefit, nobody is going to do that,” Burns warns. “If you get the benefits immediately I think people will jump at it, and that’s one of the places ACS can really help. Setting these things up, figuring out how to run and manage and deploy these container orchestrators can be tricky, finding the best practices around how to deploy them. Really, what ACS does for you is take that problem off your plate.”

ACS needs to help people through the transition to using containers, as an “application-oriented abstraction,” he explained. “We’re going from being machine-oriented to being application-oriented in the cloud and that’s a huge development, because except for the ops people who are thinking about machines, everyone else wants to be thinking about apps.”

And in the new cloud world, those aren’t just individual apps; they’ll also be PaaS products, because container services make creating PaaS far easier for developers — which means we’ll see more PaaS aimed at niche and vertical markets, rather than just broad, generic tools.

“The people who build PaaS no longer have to be distributed systems experts. Because the container orchestrators have taken over large numbers of the distributed systems problems, you’re going to see really targeted PaaS that provide a really incredible developer experience in specific, targeted verticals,” Burns said.

He believes this will also bring technology choices closer to the development team, rather than having to be a major strategic decision. “What you’ll see is those experiences then become deployed onto a container orchestrator side by side. If I need the one for my mobile apps and games, I deploy that onto my container service; if I need the one for web apps, I deploy that onto my container service — and they use the same underlying container orchestrator. That means the choice of platform isn’t as large a choice; individual development teams can make that choice as opposed to a CTO making it for an entire company.”

Choice of Containers

That kind of choice is why ACS supports multiple container orchestrators. “We find most customers have multiple needs, whether that’s because they’re a big enterprise with multiple different departments, or they’re a small company but they still need to do big data analysis. This is about finding the solutions that work best for every user.”

Kubernetes support for Windows Server containers (the Windows equivalent of the familiar Linux Docker containers) is now in preview, alongside Docker Swarm support. “When you run Kubernetes and Windows Server containers, you’re building Windows Container apps and you’re deploying them via the Kubernetes API and the Kubernetes tooling,” Burns said.

ACS also lets you have hybrid clusters that use Windows Server and Linux side by side. “In the ACS Engine, which is the open source core of ACS, we have hybrid clusters which have some Linux nodes and some Windows nodes. So, you can use service discovery and naming to build hybrid applications that use some components from Windows and some components from Linux.”

Further down the line, ACS is likely to support Hyper-V containers as well: Docker containers that run in a very lightweight virtual machine based on Windows Nano Server for security and kernel isolation, but which can otherwise be managed like any other container. That will need the nested virtualization that will be possible once Azure moves to running on Windows Server 2016, and that’s due to happen sometime in 2017. Burns called Hyper-V containers an exciting option with a lot of business uses.

“One of the last big open issues in containers is how do you get these things to be secure? Because everybody loves the deployment patterns and the utilization you can drive, but there are a lot of cases where you really want to make sure a malicious actor can’t escape from one application into another application, and I think Hyper-V containers are going to be the solution for that. So, having the ability to orchestrate those will be important,” Burns said.

Because Hyper-V containers are still Docker containers and running a container as Hyper-V rather than a Windows Server container is something you choose when you deploy it, “you can make a decision, on a case by case basis, which are the ones you need to secure which are the ones you think are more trusted,” Burns noted.

The Azure Advantage

As is increasingly the case at Microsoft, the engineers working on ACS also work on the open source projects for the tools they’re integrating. That’s something Burns says customers value. “One of my engineers is the release manager for the next Kubernetes release; he’s going to do a lot of testing to make sure that upgrades work correctly, so when you go from one version of Kubernetes to the next version of Kubernetes it works. That’s not just going to benefit Azure users; that’s going to benefit the entire community.”

It’s also important that ACS integrates well with other Azure services, like the Azure Resource Manager (a team Burns also runs). He admits that’s challenging. “The APIs for orchestrators that we’re exposing in ACS are open source APIs, so we don’t have as much control over the shape and feel of those as we do with traditional Azure APIs. ARM [Azure Resource Manager] makes some assumptions about what happens when you make an API call; in particular, it makes the assumption that you can always do the same call over and over again and it will have the same effect; that every call is idempotent. Not all APIs do that, and sometimes they expect certain paths.”

He’s hoping to find ways to bring that ARM model to some of the open source projects, starting with identity — which for Microsoft shops means Active Directory support. “Right now, when you authenticate to one of these clusters after you’ve created it, you use a different set of credentials than your Active Directory credentials. We want to make it so you can use your Active Directory credentials to authenticate to the cluster.”

Identity is going to be increasingly important for containers, and not just for ops who need to manage the lifecycle of ever-increasing numbers of containers. “Identity is an absolutely critical piece and not just for retirement and grouping; it’s also for communications. I want to be able to say this class of container can talk to this other class of container but nobody else can.”

Without that, the proliferation of identities will be just too hard to manage. “If every app has to have its own [identity system], then it’s not just one brand new one — it’s five or six or seven brand new ones. That’s a recipe for forgetting to remove somebody from a group after they leave the company.”

Container orchestration tools will need to mature to enable this, he noted. “They don’t have that notion of identity deeply baked in yet. That’s an area where we’re going to have to do a lot of work as we go forward, and with the upstream projects. This isn’t something Azure will do in a vacuum; it’s something we will do in conjunction with the open source communities, so that it works wherever people are running because Active Directory and Azure Active Directory are used all over.”

The immutable infrastructure containers provide is more reliable than the alternatives, but it’s also a big change for people who are used to logging into servers and running commands to install software. Containers might sound as if they’ve taken over the world but Burns compared the state of the market to virtualization in 2002 or 2003:

“It’s not ‘those crazy kids out there, virtualizing things’ but it’s not fully embraced by everyone either. We’re still in the early days of adoption — but the wins are very real and the wins are there for both developers and ops. With virtualization, the wins were more the ops side than on the development side, but with containers it’s a little bit more balanced. There’s wins on both sides, so progress will be faster.”


Visual Studio 2017 Offers Live Unit Testing, and DevOps for the Database


This week’s release of Visual Studio 2017 marks 20 years of the software being Microsoft’s flagship integrated development environment (IDE), and it might also mark the point when DevOps becomes truly mainstream.

Visual Studio still caters for a wide audience — from the hobby developers who use the free Community edition, to large outsourcing providers in India who need to distribute the software on DVD because their workplaces still don’t have internet connections, and to “super agile organizations who really are embracing the DevOps mentality,” Principal Program Manager for Visual Studio, Tim Sneath told The New Stack. “But we certainly see DevOps moving fast into the mainstream.”

DevOps is one of the key areas of focus for Visual Studio 2017, although Microsoft wants to make this about shifting right as well as the more usual “shift left,” explained John Montgomery, the Director of Program Management for Visual Studio. “DevOps is still something that we’re learning about and getting better at [as an industry]. The end-to-end DevOps workflow is a pretty long flow; it starts with code editing, it goes all the way through the CI/CD pipeline and eventually comes out with an analytics service; the developer might look at the results of that and then go back to their editor.”

Live unit tests in Visual Studio are marked as passing and failing — and the dash shows code with no test coverage.

Tighter Loops

“‘Shift left’ takes all that stuff that’s happening later in the CI/CD pipeline and brings it back into the inner loop so the developer as they’re typing can start to see exactly what’s happening to their app — before check-in happens, before deployment happens. I think we can do better, to tighten that cycle so you can catch a bunch of issues before check-in and identify issues in production a lot faster, right in the code editing experience,” suggested Montgomery.

That includes editor tools like exception helpers that show root cause, plus live unit testing and code analysis built into the editing experience. You can see as you edit your code whether you have full test coverage, or if a change you make means the code no longer passes a test. And instead of waiting until you build the project or check the code into your CI server to do static code analysis, you get hints and suggestions in the editor. Those can include team coding practices like using explicit types instead of ‘var’ (stored in the “EditorConfig” file in the repo for your project), as well as code suggestions that help you pick up language features like the new C# 7 syntax for throwing nulls, or tuples.

Suggestions can be based on team coding style.

Getting earlier visibility of problems that will affect deployment is only the first step. You also need the tooling to build the whole DevOps workflow. Visual Studio 2017 offers integration with other systems and services, from Microsoft’s own Xamarin Test Cloud (with real devices), Azure App Services for hosting mobile apps and Visual Studio Team Services (VSTS) for online team development, to Docker and Git, so you can containerize apps and auto-deploy from Git repositories.

You can do a lot more with Git from inside Visual Studio now; you can see the diff for outgoing commits, use a force push to complete a rebase or push an amended commit, remove your upstream branch, or continue a patch rebase. That integration is now based on git.exe, the Git core, so it supports SSH and keeps your existing configuration options.

Code suggestions help you pick up new language features.

DevOps increasingly means using open source code and external components. For Visual Studio 2017, Microsoft worked with WhiteSource to integrate its open source security and management service into VSTS (and the on-premise equivalent, Team Foundation Services) as a build task you can add to give you reports on any known vulnerabilities in open source components you’re relying on. (Visual Studio Enterprise subscribers can use WhiteSource Bolt on one project free for six months.)

For larger businesses that want all of this, Microsoft is bundling up Visual Studio, VSTS, the Azure services for CI/CD (including load testing, mobile app testing and WhiteSource Bolt), plus discounted Azure pricing and training, as an Enterprise DevOps Accelerator; a sure sign that existing Microsoft customers are interested in adopting DevOps.

 

DevOps for .NET

For Microsoft shops, it’s likely that Microsoft’s .NET and .NET Core will be a big part of the interest in DevOps.

Visual Studio is “cloud agnostic,” Montgomery promised, but he also noted the options Azure has for supporting .NET and .NET Core (which you can think of as a refactoring of .NET that fits in a lightweight Docker container and runs fast).

“When Visual Studio and .NET Core are deploying to Azure, we can do debugging and diagnostic magic. We can trace to the line of code of your faulting app in production. We can integrate App Insights telemetry back in the application without the developer having to write a single line of code.”

Using Azure App Services directly from Visual Studio, without having to go through the Azure portal, is only the first Connected Services option in Visual Studio 2017; Montgomery says there will be more. You can build containerized .NET and .NET Core applications using Docker containers for Windows and Linux from inside Visual Studio. Because Visual Studio knows you’re building a .NET app, it puts the ASP.NET image in the Docker file for you. You can even debug from Visual Studio on Windows into a Linux container, and deploy not just one container but multiple containers into an orchestrator. Even the build environment is supplied in a Docker image.

 

 

 

Covering the whole DevOps workflow is where the idea of “shift right” comes in. “’Shift right’ is the idea that building in quality from the very beginning is not enough,” explained Montgomery; “You need the continuous feedback from users that pulls all the process together; things like testing in production, and experimentation, and user telemetry collection.” They’re all “shift right” practices and Visual Studio is going to integrate more and more of them.

The Redgate tools included with Visual Studio 2017 make databases part of the DevOps workflow.

Database DevOps

One of the areas often left out of DevOps workflows is databases, but Visual Studio 2017 includes three tools to help you manage the DevOps cycle across databases as well. Redgate’s SQL Search tool, which helps you find SQL fragments and objects across multiple databases, is included in Visual Studio Community, Pro and Enterprise, which should improve database developer productivity.

Visual Studio Enterprise 2017 also includes ‘core editions’ of Redgate’s ReadyRoll database versioning and schema management tool, as well as IntelliSense-style SQL Prompt code completion.

ReadyRoll Core helps you develop migration scripts and manage database changes using source control, so it’s basically “config as code” for databases. The SQL Prompt Core extension helps you write, format and refactor SQL code, said Montgomery. SQL Prompt pops up suggestions as you type, completing operators like UPDATE and reminding you that the next command after UPDATE needs to be SET. You can run a script checker once you finish editing, as well. “These do for databases what VSTS does for source code.”

Use ReadyRoll as a build task to update your database as you build your code.

Usually, changes to code and changes to database schema are made and deployed independently, even though the code may depend on a particular version of the database schema. ReadyRoll helps co-ordinate that, explained Sneath.

As the user manipulates the database schema in production, ReadyRoll journals the differences between the current version and the previous version of the schema and can create migration scripts from one to the other. The user can check those into source code just like any other version.

“You always know what version of database schema you were on, you know what version you’re trying to get to and you can migrate forward just by running these scripts,” Montgomery said.

That means you can keep your front end and back end changes synchronized, even when developers make changes to a local version of the database, because ReadyRoll syncs their changes back to the project, refreshes the schema — and creates a changelog. You can see the contents of the scripts before you apply them, and you can make deploying schema updates to a database part of the build process.

Microsoft is in the process of updating its database development tools generally. Sneath noted that SQL Server 2016 moves its tooling into more recent versions of Visual Studio. “SQL Server 2012 was using the Visual Studio 2010 tooling; now it’s based on Visual Studio 2015.”

For enterprise developers, the strong DevOps integration between Visual Studio and SQL will be useful, but will Microsoft take that further?

NoSQL support in Visual Studio is still based on extensions rather than being a built-in feature, but that is the way Microsoft approaches new Visual Studio features, Sneath noted. “Our philosophy is to start with great extensions. The Docker support started as an extension in Visual Studio 2015 and now it’s a core part of the product, and the same is true of other features. Where things are moving very fast, where there’s a lot of rapid experimentation you’ll see us continue to ship extensions that then make their way into the core product.”

Feature image: “Sculptor’s Studio” by Louis Moeller, from the New York Metropolitan Museum of Art, public domain.


Beyond Bash: Microsoft Refines the Windows Subsystem for Linux


Microsoft’s newest update to Windows 10, called the Creators Update, will contain an improved Windows Subsystem for Linux (WSL), a tool that could make Windows 10 much more appealing to the increasing number of developers considering a move from Mac OS because they find the MacBook Pro underpowered for their needs.

WSL is often called ‘Bash on Windows’ because Bash is the entry point, and “a Bash-like experience” was one of the original goals, Microsoft’s Rich Turner told The New Stack. He’s responsible for the Windows console as well as WSL, but WSL goes far beyond having Bash as an alternative shell on Windows. Bash is just a starting point to unlock all the tools of the Linux command line.

WSL, he explained, is “a Linux-compatible environment that looks and behaves just like Linux, and allows you to run all your Linux code — your Linux build system, your GNU tools and everything else you need to run, build and test your application without having to fire up VMs.” WSL still uses the Windows kernel; it just uses it to run the system calls ELF64 Linux binaries depend on, via a pico driver with Microsoft’s clean room implementation of the Linux syscall interface.

Microsoft placed WSL in Windows largely for developers, because of the way so many open source tools and languages and libraries assume developers are using Linux. “Many of these have hard dependencies on Linux behaviors, the Linux file system layer, Linux networking socket interaction mechanisms and so on, which made some of those things struggle to work well on Windows, because Windows has a slightly different way of doing a lot of those things.”

Core languages and runtimes like node and Python and Ruby work well enough on Windows. But if you want access to the same gems, packages, libraries and modules you’d use on Linux, so you can use the exact same toolchain, you need more fundamental compatibility than just porting some of X Windows or the GNU libraries to Windows the way Cygwin and MSYS do, because that doesn’t help you when it comes to binary packages.

“A lot of Ruby gems are compiled and people take a dependency on those compiled gems and then they run into a problem on Windows. A lot of gems expect files to be in a particular location, and on Windows that looks completely different. We need the ability to load and run binaries from Linux without modification,” Turner said.

The idea isn’t to turn Windows into Linux, but to put the Linux tools that developers depend on alongside Windows tools like Visual Studio (and productivity applications like Office).

It’s also not about giving up on the idea of the Linux desktop, Dustin Kirkland, Canonical’s technical lead for Microsoft Ubuntu development, told The New Stack. “I see it as a beautiful way to introduce the UNIX and Linux way of communicating with a computer through a command line, as a gateway to tens of thousands of open source tools. The opportunity to deliver the Linux way and open source way to literally billions of Windows users is too good to pass up.”

His own lightbulb moment came while using Visual Studio to build the Ubuntu image for WSL, when he had to change one specific term across some 17 different files. Instead of hunting for an unfamiliar GUI command, he realized he could use recursive grep and sed against the project in his Documents folder in Windows, the way he would on Ubuntu.

“The two really worked beautifully together. I’m far more comfortable in vi than in any graphical editor, so being able to pop down to a vi window and create and edit files and do it natively, do ssh natively on the system, is super powerful,” he said.
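For anyone unfamiliar with that workflow, here is a minimal sketch of the kind of bulk edit Kirkland describes, run from a WSL Bash prompt; the path, username and search terms are hypothetical:

```bash
# Windows drives are mounted under /mnt, so a project in the Documents folder
# is reachable from Bash at a path like this:
cd /mnt/c/Users/example/Documents/ubuntu-image

# List every file that still contains the old term...
grep -rl 'old-term' .

# ...then replace it in place across all of those files with sed.
grep -rl 'old-term' . | xargs sed -i 's/old-term/new-term/g'
```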

Daily Driver

If you tried WSL early on and were disappointed, it’s time for another look. The version of WSL in the Windows 10 Anniversary Update was an early release to get developer feedback about what tools they needed WSL to run. “It was a snapshot into where we’re going,” Turner emphasized, noting it had obvious gaps.

“You couldn’t ping, you couldn’t look at your ifconfig to see how your network was configured. We couldn’t run Java, we couldn’t run npm because it wasn’t able to enumerate the network configuration. Those all work now, and we can run MySQL and Postgres and Apache and Nginx and node and Ruby and Java and Python, and even Core CLR works now for ASP.NET.”
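If you want to check that for yourself after updating, a quick sanity pass from the Bash prompt covers the pieces that used to be missing (a rough sketch; the language runtimes assume you have already installed them with apt):

```bash
ping -c 3 example.com   # ICMP traffic now works
ifconfig                # network configuration can now be enumerated
node -e 'console.log("Node.js runs")'   # Node.js works
java -version           # Java runs
```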

In the Creators Update, key tools like SSH and the GDB GNU debugger work more reliably. And you can even configure WSL as the target for Visual C++ for Linux in Visual Studio, so you can edit and debug visually and then compile and build in WSL.

Making those tools work has meant “adding huge numbers of newer capabilities either in new syscalls or by expanding the breadth and depth of our syscall implementations, allowing more tools and libraries to run,” he said. Adding a new syscall or a new capability to an existing syscall often fixes 20 or 30 issues, making for fast improvements.

Kirkland calls the progress outstanding. “We’ve seen the Windows kernel team filling out even more of WSL, capturing more system calls, and ensuring that anything and everything you would expect to work in Linux continues to work. We see things like Screen and Tmux and Byobu now working very well. We’re starting to see bits and pieces that provide the initialization procedures so we can start doing things in the not too distant future, like perhaps containers.”

Canonical maintains the user mode images that Windows 10 systems download when you first run Bash. So far those have been Ubuntu 14.04, with a new image about every three months (that’s the frequency Microsoft suggested Windows users would be comfortable with, Kirkland noted). With the Creators Update, that switches to Ubuntu 16.04 (Xenial, the version released last April).

That’s a big improvement, because it means much newer versions of commonly-used libraries, compilers and utilities, all natively packaged and just one apt install away.

Ubuntu 16.04 will be installed by default if the Creators Update is the first time you’re using WSL on a PC. If you already have 14.04, Windows won’t update your distro; Turner says that’s because of the very strong feedback from the community that they didn’t want the update to be automatic. You can do an in-place upgrade using `sudo apt dist-upgrade`; if you want a clean 16.04 instance, use `lxrun /uninstall /full` to remove your Ubuntu instance and then reinstall it with `lxrun /install`.

The Creators Update also fixes some simple annoyances, adding mouse support in the console (as well as 24-bit color). It also integrates the Windows and WSL environments more closely. “You can run a command on the Ubuntu system that affects the Windows system,” Kirkland explained; “so you can edit files in real time and have the file updated in both Notepad and vi, or you can launch an app from Linux that triggers an event in Windows — or vice versa.”

That integration also means you can see Linux processes in the Windows task manager. “Supporting Tmux allows you to have multiple panes and each pane is running its own Bash, so if you look in task manager you’ll see multiple instances of Bash, the main one that’s a child of init, and instances for each of your tiles,” Turner explained. “If you run MySQL, you’ll see MySQL in task manager.”

That gives you a handy way of dealing with runaway Linux processes or error-prone Bash scripts. You can just right-click on them in task manager and kill the process. Network and system monitoring tools can also see WSL processes, because they’re exposed to the Windows Management Interface and they use the Windows networking stack and the Windows firewall. Enterprises may want this so that they can use their existing security monitoring tools with WSL processes as well as Windows ones.

That also fixes a big management problem for businesses whose developers might be using dozens of VMs that can’t be monitored the way Windows tools are. “They can get developers off Hyper-V and VMware VMs which bypass most of the Windows network stack and talk directly to the network card,” Turner explained.

Sticking with the Command Line

Now that the WSL platform covers the majority of mainstream developer scenarios, the emphasis is shifting to what Turner called the more esoteric and edge cases, as well as more developer requests, and there will continue to be more updates and improvements.

One thing you shouldn’t expect WSL to do — officially — is support the Linux desktop. Users have been experimenting with this, running everything from the Ubuntu desktop to Firefox. The fact that they work is purely a by-product of making WSL compatible with the developer tools it’s designed for, Turner said.

“We’ve been very clear that the reason we are building WSL is to provide an environment for developers to get their work done. There are some Linux GUI tools that are used by developers, but by and large, the majority of tools they want to be able to run are compilers and debuggers and build engines and so on. We are only focusing our efforts on command line tools and scenarios.”

Microsoft isn’t discouraging those experimenters, though. “It’s been so much fun for us to watch,” he said, “and we’re not doing anything to prevent it, but it’s not something we’re focusing our efforts on.”

Feature image: The Windows kernel handles Linux system calls for the Windows Subsystem for Linux using a pico process (Microsoft).

The post Beyond Bash: Microsoft Refines the Windows Subsystem for Linux appeared first on The New Stack.

Microsoft Mulls Expanding Windows 10 to Support Multiple Linux Distributions


Microsoft’s latest update of Windows 10, called the Creators Update, marks a new level of stability and support for the Windows Subsystem for Linux (WSL), the way to run Linux binaries on Windows 10. It also brings the promise of expanding beyond the current Ubuntu distro to support multiple Linux distributions in the future.

Microsoft’s Rich Turner told The New Stack that with the Creators Update, WSL is good enough to be a “daily driver” for developers who need to use Linux command line tools alongside Windows GUI tools. “We’re getting to the point where the maturity of the underlying platform is getting reasonably good, at least in terms of mainstream developer scenarios.”

But the WSL team still has plenty of improvements planned, some of which need changes in the underlying Windows systems.

“We know that file system performance needs to be improved,” Turner said. “We want to increase our disk I/O throughput; our disk IO performance right now is not where we want it to be. In almost every other aspect, anything to do with process or memory throughput, we’re actually as fast as Linux if not a little bit faster on the same hardware,” he claimed. “On network I/O we’re looking really good but we’ve got some extra network socket modes that we need to support. There are a couple of esoteric network tools that need particular types of socket support we can’t currently do, so we’re working with the Windows networking team to add those. We’re working with the storage and NTFS team to provide us some extra hooks so that we can make our disk storage and throughput more efficient.”

Those changes will take some time, he noted. “We will address [these areas] in the next version of Windows and then some more the version after that. The file system changes, in particular, are things that we have to take a great deal of care with, so we’re going to take our time with those, to make sure they’re done properly.”

Another area that’s frustrating for developers but also needs to be approached carefully to make sure the security model is correct is network file access. “People want to be able to mount NFS drives and ssh connections and SAMBA drives to be able to connect up to a Windows network share, for example. We don’t support arbitrary mounting right now. We kind of mount your local hard drive, but that’s ‘mount.’ We want to be able to support eventually the real mount capability within Linux itself, but we’ve got engineering and code to write to make that happen.”

WSL will also get better support for working with devices. “Developers want to build an Android build and deploy it to their phone, or they want to deploy to a Raspberry Pi module because they’re building an IoT device. Right now, we don’t support USB devices or serial [connections] but we want to be able to eventually add that as well.”

WSL doesn’t currently run on Windows Server, but that’s another possibility (again, it needs work).

Some requests would need more major changes, like being able to run persistent, background daemons and services, to make cron more useful. Currently, you can run any background services you want — but they only run as long as the bash console is open.

“Until you run Bash, no Linux process can run on your machine. As soon as you open Bash, you can choose to start any background services you want to have run. If you want to have MySQL or FSH or ssh or Postgres or Apache or whatever run, you can start them manually or autostart them with the .bashrc file,” Turner pointed out. “But as soon as you close the Bash console, we tear down any running Linux processes. So if you close the console window, you can no longer access your system via ssh, from your machine or any other.”
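There is no init system to lean on, but a few lines in ~/.bashrc give you the manual-autostart pattern Turner describes. A minimal sketch, assuming the openssh-server package is installed:

```bash
# Appended to ~/.bashrc: start sshd when a Bash console opens, if it isn't already running.
# sudo will prompt for a password unless you have configured it not to, and anything
# started this way is torn down again when the last console window closes.
if ! pgrep -x sshd > /dev/null; then
    sudo service ssh start
fi
```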

That’s something developers would find useful, Dustin Kirkland, Canonical’s technical lead for Microsoft Ubuntu development, told The New Stack. “Users come to me from time to time and ask about persistent daemons. It’s one thing to have ssh working while the terminal is open, but it’s another thing if it’s always running.” After all, “if you’re running a Linux virtual machine with a network mapped, you can ssh into the VM as long as it’s running.”

Ideally, says Turner, background services would integrate with Windows the way processes now do. “What we want to be able to do in future is figure out a decent way to have Linux processes running in the background as daemons or as services, and to be able to have those have auto-start on the machine if you want, in a manner similar to Windows services.”

Canonical would also find persistent cron useful for updating the Ubuntu image WSL uses. “On a traditional Ubuntu system, a cron job runs once a day, typically at night, that does the apt update and gets the list of packages available, and the updates that are security critical are automatically applied,” Kirkland pointed out.

That doesn’t happen with WSL, but Windows does periodically get information about available updates, Turner explained. “Right now, we have a background task that runs every five to eight days that periodically pings the apt cache and downloads the apt package index, so when you start a new session we can say what packages do you have installed and what the latest versions are and show you a message saying ‘hey, 43 of your packages are a bit behind, you might want to update.’ We don’t auto-update your distro. We will auto-update WSL itself, so we will patch the underlying implementations of the syscalls, we will patch our user mode tools, like the console itself and we’ll patch the install mechanism if necessary, but we don’t touch the internals of your Linux distro.”

“For Canonical to auto-update your system, they need the cron daemon running in the background. Once we’ve implemented that background mechanism, things like that can take place.”

Auto-updating isn’t uncontroversial. “Some customers are vehemently of the opinion that there should be no auto-update ever,” he noted. “But some, especially enterprise customers, are vehement that all Linux instances must be updated to a certain patch level so that if a vulnerability is discovered in a distro, it is automatically patched and doesn’t become a problem. We want you to have both of those options, but we have to figure out the background process story to enable that.”

Kirkland also pointed out the way networking works in WSL. “Right now, the network space is completely flat, so the Windows desktop and the Linux shell are sharing the same IP address and the same set of ports. If you tried to use the same port on both of them, you’d end up with a port conflict. Microsoft has some protections in place for that but it’s an interesting consideration. Normally when you think about running one OS on top of another, each OS gets its own network space, its own IP address; that can be NAT’ed or bridged. But trying to collapse that TCP stack into a single network space is interesting. The team has made some good safe decisions from that perspective but it’s a complicated one to think about.”
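A simple way to see that shared network space in action is to start a listener inside WSL and reach it from the Windows side; a sketch, assuming Python 3 is present in the Ubuntu instance:

```bash
# Inside the WSL Bash console: serve the current directory on port 8080.
python3 -m http.server 8080

# Because Windows and WSL share one TCP stack and one set of ports, a Windows browser
# pointed at http://localhost:8080 reaches this same server, and a Windows process
# already bound to port 8080 would have caused a conflict instead.
```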

The priorities for all these improvements depend both on community interest and what they add to WSL. “It all gets put on the stack and prioritized and we knock those things off in priority order based on how frequently we hear the request being made and how much impact we think the feature will have in the community,” Turner explained.

Some requests will need a lot more work. Support for Linux containers (as opposed to the Docker support for Windows containers that’s built into Windows 10 and Windows Server 2016) was one of the first requests Microsoft had for WSL. There are some high-level architectural similarities between WSL and what container-based systems do that prompted those questions, but there are also some key differences.

“Implementing all the underlying kernel infrastructure that would be required to support containers and namespaces requires some work. It is on the backlog and at some point in the future, hopefully, we will be able to consider what it would take,” said Turner. “It’s something we do hear on a regular basis and we are interested in doing, but we have to figure how to find the time.”

Beyond Ubuntu

Officially, WSL only supports Ubuntu at the moment. Unofficially, WSL users have already been experimenting with running distros like SUSE; Microsoft has learned a lot from that work because it shows which syscalls need to be added or extended.

When work first started on WSL, Ubuntu was chosen because it was the most popular distro with the developers the team consulted. Kirkland puts that down to Ubuntu’s regular update policy for a wide range of Linux tools. “Developers using Ubuntu choose us because we provide the latest and greatest open source software on a very timely and predictable schedule. We do dozens of bug fixes and security updates every day in any one of the 25,000 open source projects which are built into 55,000 binaries that are now just one single app install away.”

Since WSL came out, though, developers have also asked for other distros like Alpine and ArchLinux and RedHat and SUSE, and more specialized distros too, Turner said. “We want this to be a distro-agnostic platform on which developers can be unblocked on doing what they need to do, so they can build the Linux portions of their code alongside their Windows code, and alongside UWP apps, and alongside everything else they need to be able to ship their mobile desktop and cloud solutions, regardless of what technology those solutions are targeting.”

“As far as WSL is concerned, we don’t even know what distro you’re running, it doesn’t know what apps are running on top of it; it just does the work that’s asked of it,” he points out. “Essentially, WSL is just a piece of kernel infrastructure that provides a layer that is compatible with the Linux kernel system call interface. When something calls it and says ‘open a file’ or ‘read from a file’ or ‘open a network socket’ or ‘allocate me some memory,’ it just allocates the memory and hands it back.”

Supporting other distros will need changes to the user mode tooling that lets you install and uninstall your distro, “because you don’t want to have to nuke an instance to replace it with another.” But first WSL needs to implement all the syscalls that the tools in all the different distros depend on.

“Each of those flavors of Linux has different idiosyncrasies; they have different installers, they have different file system layouts, different configuration systems for some tools… if you’re running a Red Hat, it might be using a different installer to Ubuntu and the installer for Red Hat might make use of a syscall that Ubuntu doesn’t, so it is one we’re missing.”

That means trying out distros to see what syscalls are missing — or what extra capabilities are needed in syscalls WSL already has.

“That will allow us to start supporting a broader set of tools and we will eventually get to the point where we feel comfortable that we can support a Red Hat or a SUSE or a CentOS or one of the other thousands of distros.”

Microsoft wants those distros to be formally supported on WSL, but not by having the distros do extra work for Windows. “What we intend to do is provide enough infrastructure under the hood that those other distros can run on top without us having to do anything,” said Turner.

Kirkland welcomes that; not just because Linux is all about choice, but because it will “improve the overall quality and testing of the system calls WSL provides”.

At that point, WSL might be able to run not just more than one Linux distro — but even more than one Linux distro on the same PC. It would take a lot of work to make it happen, but it’s a common request and Turner can see the attraction for developers.

“Some people work on systems that span internal and cloud-based environments. You might have an internal system that manages information coming from customers, that’s in MySQL with a Redis cache; a lot of enterprises use RedHat and SUSE for that. But then when you have the cloud front end to that system it might be housed in AWS running on top of Ubuntu. If I have to work on things in my Ubuntu cloud interface and I also have to work on things in my RedHat or SUSE backend environment, then I need both environments to get my work done; I couldn’t do it all in Ubuntu.”

Today that would mean using at least one virtual machine, which means paying for a cloud service or having powerful enough hardware to run multiple VMs. Ironically, because WSL is a layer in Windows rather than a distro itself, it opens up new possibilities for creating an extremely flexible platform for developers that Linux itself can’t easily deliver.

Feature image by Stefan Kunze via Unsplash.

The post Microsoft Mulls Expanding Windows 10 to Support Multiple Linux Distributions appeared first on The New Stack.

Why TypeScript Is Growing More Popular


Why is TypeScript getting so popular? Key development frameworks depend on it and it improves developer productivity in the ever-changing JavaScript world.

The recent Stack Overflow Developer Survey and the annual RedMonk programming language rankings both showed that TypeScript — the open source project started by Microsoft to combine transpiling for advanced JavaScript features with static type checking and tooling — is reaching new heights of popularity. By providing minimal checking syntax on top of JavaScript, TypeScript allows developers to type check their code, which can reveal bugs and generally improve the organization and documentation of large JavaScript code bases.
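Trying that on an existing project is deliberately low-friction. A minimal sketch of the setup from the command line, assuming Node.js and npm are already installed:

```bash
npm install -g typescript   # install the TypeScript compiler, tsc
tsc --init                  # generate a tsconfig.json with default compiler options
tsc                         # type-check the project's .ts files and emit plain JavaScript
```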

Nine and a half percent of the developers Stack Overflow surveyed are using TypeScript, making it the ninth most popular language, just ahead of Ruby and twice as popular with that audience as Perl. Stack Overflow reaches a diverse audience in this survey; the top two languages used are JavaScript and SQL, so this survey isn’t just querying front end development. In fact, TypeScript coders show up in all four of the job roles Stack Overflow asks about: web developers, desktop developers, admins and DevOps, and data scientists.

RedMonk’s rankings combine the Stack Overflow numbers with GitHub pull requests to find out what developers are thinking about, as well as what they’re using. TypeScript has also gained popularity with this audience of developers, moving from 26 to 17 in the rankings. Some of that is down to interest on Stack Overflow, but mostly it’s because of the increased developer involvement on GitHub.

Indeed, GitHub’s own 2016 State of the Octoverse puts TypeScript as the 15th most popular of the 316 programming languages developers use for projects on GitHub (based on both the number of pull requests and the 250 percent increase in pull requests for TypeScript over the previous year).

TypeScript also has both the highest usage (21 percent) and the highest interest among those not yet using it (39 percent) among the various “alternative” JavaScript flavors in another survey of developers. The methodology of this survey is unusual — it rather strangely conflates transpilers with package managers like npm and Bower — but the developers who responded to the survey and use TypeScript also commonly use ECMAScript 2015, NativeScript, Angular, and especially Angular2.

RedMonk’s Stephen O’Grady notes that “it seems reasonable to suspect that Angular is playing a role” in the increasing popularity of TypeScript. Angular2 is just one of the projects that has adopted TypeScript though (Asana and Dojo already used it, as do internal projects at Adobe, Google, Palantir, SitePen and eBay). But it might be the best known — with Google employees like Rob Wormald [@robwormald] evangelizing TypeScript alongside Angular.

Not Just Angular2

“There’s no doubt the partnership that we have with the Angular team has helped drive the numbers,” core TypeScript developer Anders Hejlsberg told The New Stack. “That goes without saying; but even so, I think the real point is that it was a massive vote of confidence on the part of an important industry force.”

That vote of confidence is broader than just Angular, he pointed out. “Lots of other frameworks are using TypeScript at this point. Aurelia, Ionic, NativeScript are all, in one way or another, involved in TypeScript. The Ember framework, the Glimmer framework that was just released is written in TypeScript.”

“We’re seeing a pretty large vote of confidence by a lot of people who have a lot of experience in this industry and I think that’s probably what everyone at large is noticing,” — Anders Hejlsberg

That vote of confidence brings framework users on board too. “We’ve done a lot of work to be a really great citizen in the React ecosystem. We support JSX, we support all the advanced type system features that you want like refactoring and code navigation on JSX markup. We’re also now working with the Vue.js community to provide better support for the patterns used in the framework,” Hejlsberg said.

Adding support for new frameworks is an important part of staying popular with developers. “We’re always on the lookout when it comes to frameworks. We understand that this is a very dynamic ecosystem. It changes a lot; you’ve got to stay on your toes and work well with everything.”

The same is true for the tooling pipeline, especially as ECMAScript modules become more popular. “A lot of people writing modern style JavaScript apps use modules, and when you’re using ECMAScript 6 modules you need a bundler to bundle up your code so it can run in a browser, like Webpack or Rollup.js. We make sure to work well with those tools so we fit into the whole pipeline,” Hejlsberg said.

React is a library with Facebook roots. Angular is a Google-spawned framework. There is abundant analysis comparing them, and in general, it shows that Angular trails, with Vue.js getting significant buzz. Angular has seen a lot of uptake among TypeScript fans, with 41 percent prioritizing 2.x and another 18 percent favoring the older version. With the recent release of Angular 4 and TypeScript’s growing popularity, we expect the JavaScript wars to continue (Lawrence Hecht).

There’s also been the same steady growth in the number of libraries with TypeScript definitions. DefinitelyTyped, a repository for TypeScript type definitions, now has over 3,000 frameworks and libraries. That’s accelerated by automatically scraping and publishing declaration files as npm packages under the @types namespace.

“That means there’s now a very predictable way of discovering which frameworks have types, and we can auto-provision the types. When we see you’re importing a particular framework we can go find types for you so you don’t have to do it anymore.” In fact, Hejlsberg claimed, “for some developers, that’s becoming a decision factor when they pick a framework; whether they can work with a framework and get types.”
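Under the hood that auto-provisioning is backed by ordinary npm packages, so the manual route is also a single command; a minimal sketch, using lodash purely as an example library:

```bash
npm install --save lodash              # the library itself
npm install --save-dev @types/lodash   # its type definitions, published from DefinitelyTyped
```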

“Often the way TypeScript ends up being adopted — in enterprises and start-ups and individual developers — is that you try it on one project and you say ‘wow, this is great!’ and then you start evangelizing and it grows locally in your sphere of influence.”— Anders Hejlsberg

The general rise in interest seems to be one of organic growth. “We don’t do any advertising whatsoever, this is all driven by the community. It’s actually steady growth and we’re just starting to notice the larger numbers now,” Hejlsberg said.

Hejlsberg notes that TypeScript is also the third most loved language in the Stack Overflow survey after Rust and Smalltalk (and just ahead of Swift and Go) and the sixth most wanted language, ahead of both C# and Swift. “I think that speaks a lot to the fact that we’re actually solving real problems,” Hejlsberg said.

Microsoft’s Sphere of Influence

It’s easy to view the success of TypeScript as Microsoft bringing enterprise developers who are already in the Microsoft world to JavaScript via familiar tools.

“We obviously have a large developer ecosystem already with C# and C++ and Visual Basic. Lots of enterprises use Microsoft tooling and they also have front ends, and when we start improving the world on the front end side, they sit up and take notice and start using that,” Hejlsberg admitted.

But while a lot of TypeScript development is done in Visual Studio, just as much is done in Visual Studio Code, Microsoft’s open source, cross-platform IDE. “That’s a community we increasingly did not have all that much of a connection to. For Visual Studio Code, half of our users are not on Windows, so all of a sudden we’re having a conversation with a developer community that we did not really converse much with previously.”

Open Source and on the Fast Track

The TypeScript team recently announced that releases will now happen every two months rather than quarterly, which Hejlsberg called an attempt to make release dates more predictable, rather than holding up a new release to get a particular feature in. That’s the same approach that the ECMAScript committee is taking.

The new release cadence for TypeScript is also aligned with the Visual Studio Code schedule; partly because Visual Studio Code is actually written in TypeScript, but also because tooling is a key part of the appeal of TypeScript.

While it’s important that TypeScript supports multiple editors and IDEs, Hejlsberg noted that Visual Studio Code is another factor helping with the popularity of the language.

In fact, you get better coding features because of TypeScript, even if you only write in JavaScript, he explained. “Visual Studio Code and Visual Studio both use the TypeScript language service as their language service for JavaScript. Since TypeScript is a superset of JavaScript, that means JavaScript is a subset of TypeScript; it’s really just TypeScript without type annotations,” he noted.

In Visual Studio Code, opening a JavaScript file will trigger a TypeScript parser, scanner, lexer and type analyzer to provide statement completion and code navigation in the JavaScript code. “Even though there are no type annotations, we can infer an awful lot about a project structure just from the modules you’re using and the classes you’re declaring,” Hejlsberg said. “We can go and auto-provision type information for the framework you’re importing then we can give you excellent statement completion in JavaScript, which actually surprises the heck out of people.”

What makes this fast cadence possible are the tests required for pull requests to be accepted, guaranteeing the quality of the master branch, and the popularity of TypeScript, which means any problems are found quickly.

“We’re an open source project, we do a lot of work on GitHub. And we never take pull requests unless they pass all the 55,000 tests that we have, and unless they come with new tests if you’re implementing a new feature, or regression tests if it is fixing a bug. That means our master branch is always in very good shape,” he said.

JavaScript: Powerful but Complex

More than any single factor, what might really be behind the increasing popularity of TypeScript is how complex JavaScript development has become, and also how powerful it can be.

“Our industry and our usage of JavaScript has changed dramatically,” Hejlsberg pointed out. “It used to be that we lived in a homogenous world where everyone was running Windows and using a browser, and that was how you got JavaScript. Now the world has become very heterogeneous. There are all sorts of different devices — phones and tablets, and running JavaScript on the backend with node, and JavaScript has jumped out of the browser using things like NativeScript or React Native or Cordova that allows you to build native apps using JavaScript.”

“Yes it’s more complicated but it’s also infinitely more capable,” Hejlsberg said of JavaScript. “You can reach so many different application profiles with JavaScript, with a single language and toolset. To me, that’s what is fueling all of this: The incredible breadth of the kinds of apps you can build, and the kinds of reusability and leverage you can get in this evolving ecosystem. It’s not just got more complex; it’s also gotten way more capable.”

TNS analyst Lawrence Hecht contributed to this report.

Feature Image from GitHub.

The post Why TypeScript Is Growing More Popular appeared first on The New Stack.

Microsoft Puts AI Where the Data Is


If you want to do machine learning, you need data to do it with. So far, however, the complexity of machine learning tools has usually meant doing development with a framework like TensorFlow, the Microsoft Cognitive Toolkit, using R and Python and specialist statistical tools, or using cloud APIs to machine learning services.

Any of these approaches requires getting the data out of a database, and then integrating the output of the machine learning system with the applications. Those transforms and transfers and integrations make development and deployment more complex and error-prone, slow things down, and discourage retraining models as frequently as you might want (to avoid ‘ML rot’).

With the second Community Technology Preview of the SQL Server 2017 relational database management system (RDBMS), Microsoft is adding in-database machine learning functions as stored procedures, plus support for Python as well as R. SQL Server R Services is now called SQL Server Machine Learning Services, and this interface also lets you reach out to GPU-powered analytics, data processing and machine learning tools like deep learning frameworks.

“You never have to take your data out of the database, so you have all the security and audit tools that you’re used to,” Joseph Sirosh, corporate vice president of the Microsoft data group, explained to The New Stack. He called SQL Server 2017 “The first commercial, transactional RDBMS that supports AI, that supports intelligence in the database. The manageability is a huge part of the value that a database brings and now you have an intelligence management system.”

Speed and Security

Although Microsoft has an increasing range of machine learning services (Cognitive Services now includes 25 APIs), you can connect a wide range of machine learning tools to SQL Server 2017. “The speed of innovation in artificial intelligence is in open source,” said Sirosh. A query could use Python or R code to invoke a GPU-powered library to transform data, then run deep learning on that transformed data, and get a deep-learned prediction.

“Intelligent solutions will not be restricted to any one company, they’re going to be democratized the way all computing has been and they’re going to be created by mainstream developers,” Sirosh said. “If you think about what developers need to create their intelligence revolution, we think they need simplification of intelligence and availability of intelligence in the platforms they use.”

“When you locate the algorithms right next to the data in the data platform, you don’t have to slosh data around; the algorithm comes to the data and it runs dramatically faster because it’s running in place.”  — Joseph Sirosh.

Putting intelligence in the database is about performance, but it’s also about manageability and developer productivity, he said. “The data we learn from is massive; you can’t move that around networks without incredible slowdown. When you locate the algorithms right next to the data in the data platform, you don’t have to slosh data around; the algorithm comes to the data and it runs dramatically faster because it’s running in place.”

Compliance and security are another good reason to keep the data you learn from in the database; after all, it’s often data about your customers that will damage your brand and your bottom line if it leaks. “Databases provide high availability, access control, security, encryption; you can take advantage of that. You can train deep learning models with data that resides in the database and you can deploy them in the database itself, and you never take data out of the database.”

Because the intelligence features in SQL Server are treated like any stored procedure, users can also take advantage of the other SQL Server built-in security and access controls like hiding rows and columns users don’t have the rights to see. “You can learn with an identity that has access to all the data but when you deploy [your intelligent app], the identity using it might not have access to the privacy-sensitive piece of data because those can be masked off,” Sirosh pointed out. You could even create data simulations and what-if scenarios in SQL Server if you have a limited training set that you need to bulk up.

 

Where the Machine Learning Meets the Apps

Once your machine learning system is trained, you need to operationalize it. Often that means rewriting R code in another language like JavaScript so you can run it in a web server, as well as provisioning the system to run it; another inefficient and time-consuming step.

If you need to draw data from multiple sources like Hadoop, SQL Server already includes the Polybase technology, which dramatically simplifies querying Hadoop. Rather than computing joins between the different data sources and setting up a highly available system, which is complicated to build, analyse and scale out as usage grows, administrators can let the database do the heavy lifting, given that databases “have support for concurrency, for joins, for data management, as well as security,” Sirosh said.

SQL Server 2017 offers a built-in platform to serve machine learning models from, with monitoring and performance tools, and the usual database development tools, like SQL Server Management Studio and the SQL Server Data Tools, as well as Visual Studio. You can expect these tools to be better integrated in future, Sirosh suggested, as well as for more machine learning models to be built in. “In the future, we want to make AI functions into simple SQL functions, like the ‘analyze faces’ function in SQL Azure.”

Microsoft is applying the same principles of putting data and intelligence tools in the same place to R Server and its Azure cloud data services. Azure Data Lake Analytics lets you run U-SQL, R, Python and .NET code against petabyte-scale databases and U-SQL includes a number of the APIs from Cognitive Services as functions you can call.

If you’re storing data in Microsoft’s globally distributed NoSQL service, DocumentDB, that now integrates with Spark so you can run machine learning on that data. And Microsoft R Server 9.1 includes several machine learning algorithms from Microsoft, plus pre-trained neural network models for sentiment analysis and image recognition.

But for many enterprises, SQL Server is still where their data lives and it’s what drives the apps that create and use that data. The history of SQL Server — much like the history of Windows Server — is Microsoft taking features like high availability, data analytics and business intelligence that were once reserved for expensive, high-end systems and bringing them to mainstream businesses at affordable prices.

While the machine learning and artificial intelligence landscape is far more complicated than the database market, if Microsoft can turn SQL Server into a platform where enterprises can work with machine learning and AI from the comfort of their own database systems, where they have familiar controls and development tools, then AI and ML may truly find a home in tomorrow’s enterprise.

Feature image via Pixabay.

The post Microsoft Puts AI Where the Data Is appeared first on The New Stack.


Microsoft Brings Container Orchestration to Azure Service Fabric, for Windows and (Soon) Linux


Microsoft is investing in containers, both with its Azure Container Service and with Azure Service Fabric, its distributed systems platform for building microservices apps.

Service Fabric is the fabric that Azure runs on, made available to developers. Azure SQL Database, Skype for Business, Service Bus, Event Hubs, Cosmos DB, Intune and other Microsoft cloud services run on Service Fabric. Users can deploy the cloud version in Azure (and soon other clouds) or put the runtime on their own Windows and (soon) Linux servers.

Azure Service Fabric takes care of handling resiliency by placing different microservices in different servers in different racks; it handles rolling updates, with an automatic rollback to known good states if there are problems, and has health monitoring for failures in VMs, in services and microservices, plus it supports both stateless and stateful microservices.

Originally pitched to developers as a PaaS for building cloud services, Service Fabric is also being used by organizations to “lift and shift” existing applications to the cloud, sometimes in conjunction with Visual Studio Team Services, Microsoft’s continuous integration and deployment service. Service Fabric has a naming service for container endpoints and DNS service for inter-container communications. Apps can be debugged remotely with Visual Studio and monitored with Operations Management Suite.

Last week, the company released a new version of Azure Service Fabric (version 5.6) that can handle container orchestration duties for Windows Server Containers. And later this year, you’ll be able to use Service Fabric as a container orchestrator on Linux, and there are more container orchestration features planned.

 

Brendan Burns, who runs the Microsoft ACS team, views Service Fabric as “a container orchestrator just like Kubernetes or anything else; it also has a richer environment for building applications. It combines the ability to deploy containers with a richer programming environment.”

Microsoft just acquired Deis for its Kubernetes expertise. And it’s also adding Docker Compose support to Service Fabric, which already handles orchestration for Windows Server Containers.

ACS is Microsoft’s (increasingly heterogeneous) container hosting service that you can build container-based systems on. ACS runs container orchestrators for you, and you use the same open source APIs and the same ecosystem of tools you’d use to work with those orchestrators anywhere else, without having to worry about the complexity of building and maintaining the orchestrator infrastructure.

 

ACS also supports the new managed database offerings on Azure, PostgreSQL and MySQL as a service; these are high-availability services with data protection, recovery and elastic scaling. That’s Microsoft meeting developers where they are, Microsoft general manager for database systems Rohan Kumar told us; “There are developers who want to build on MySQL. This uses the same fabric on which we’ve built our other relational databases, it has built-in high availability, scaling up and down without any downtime.”

According to Burns, Deis’ Steward and other service brokers will be supported for the managed database services soon, so that if you want your Kubernetes application to use a PostgreSQL or MySQL database, you’ll be able to browse a service catalog, pick the database you want and have the service broker provision it and bind it to your application, all through ACS.

Scenarios and Services

With so many container features coming to Service Fabric, when should you be considering it and when would you use ACS? Or how about Azure Batch, where you can use Docker to package and deploy jobs?

For Azure Service Fabric, you can use low-level primitives, or higher-level abstractions like key-value stores and queues, or even the higher-level actor model (based on the same research that was first developed as Project Orleans). As Azure Chief Technology Officer Mark Russinovich put it when Service Fabric was first announced, “This really democratizes stateful distributed programming — which is the hardest kind of programming there is.”

Increasingly, you can use containers with almost every Azure service, so that you can choose the right service — not just the one that supports containers, Corey Sanders, Microsoft head of product for Azure Compute, told The New Stack.

“One of the benefits of having containers everywhere is that you don’t need to make your decision on which platform to use based on whether you’re using containers or not. For web apps, if you want to deploy a solution that’s got CI and CD integration, that’s got pre-warming, that’s got all the bells and whistles that come with app services as a fully managed platform offering, but you want to use containers — great, we have that support [in Service Fabric]. Or if you want to run a job-based computing solution and not have to manage the infrastructure and just run jobs across a set of virtual machines, but you want to use containers to host your jobs — we’ve got that. That’s Azure Batch. The idea being that when you approach the platform, you approach with a scenario [in mind], not just with ‘I want to use containers’.”

Containers are rapidly becoming the de facto way of encapsulating applications, which is why Docker and the rest of the container ecosystem are showing up in Microsoft’s products and services, from Windows Server to SQL Server 2017 to everywhere in Azure, Sanders explained. Rather than a destination, containers are another tool.

“The combination gives customers a lot of flexibility to pick [a service] based on their interest, their scenario and their programming models. Containers supported everywhere on Azure is such a great step forward.”

Feature image via Pixabay.

The post Microsoft Brings Container Orchestration to Azure Service Fabric, for Windows and (Soon) Linux appeared first on The New Stack.

Microsoft Draft Offers Kubernetes Support for Developers


Containers make it easier to deploy applications with all their libraries and dependencies, though in many cases organizations do have to change their workflow to accommodate the new technology. That can cause adoption of container technology to stall inside organizations when the change is driven by operations, noted Gabe Monroy, Microsoft’s lead program manager for containers on Azure.

This is a problem Monroy believes a new open-source Kubernetes deployment tool called Draft can fix. Monroy is the former chief technology officer for Deis, a company that Microsoft is in the process of acquiring. The company unveiled this technology at the CoreOS user conference, being held this week in San Francisco.

“Draft is solving what I think is the number one problem facing organizations that are trying to adopt containers at scale. When the operations and IT teams in a company have bought into the idea of containers and they stand up Kubernetes clusters and have some initial wins, they turn around ready to unleash this to a team of a thousand Java developers — and the reaction they get is like deer in the headlights. It’s too complicated, it’s too much conceptual overhead; this is just too hard for us.”

In other words, operations teams need to make Kubernetes easier and more palatable for software teams.

Draft reduces that conceptual overhead by taking away most of the requirements for using Kubernetes; developers don’t even need to have Docker or Kubernetes on their laptop — just the Draft binary.

“You start writing your app in any language, like Node.js, you start scaffolding it and when you’re ready to see if it can run in the Kubernetes environment, in the sandbox, you just type ‘draft create’ and the tool detects what the language is and it writes out the Dockerfile and the Helm chart into the source tree. So it scaffolds out and containerizes the app for you.”

Helm is the Kubernetes package manager developed and supported by Deis.

That language detection is based on configurable Draft “packs,” which contain a detection script, a Dockerfile and a Helm Chart. By default, Draft comes with packs for Python, Node.js, Java, Ruby, PHP and Go. Microsoft is likely to come out with more packs — TypeScript is under consideration — but Monroy expects the community to build more packs to support different languages, frameworks and runtimes.

“One of the benefits of microservices is allowing teams to pick the right language and framework for the job. Packs are extremely simple, so teams can and will customize them for their environment. Large customers want the ability to say ‘here is our Java 7 environment and our node environment that are blessed by the operations team and we don’t want developers to do anything else.’ Draft packs allow that kind of customization and control.”

The container can be a Windows or Linux Docker container; “we’re targeting all the platforms,” Monroy confirmed, and in time that will include Linux Docker containers running on Windows 10 and Windows Server 2016 directly through Hyper-V (rather than in a virtual machine).

The second new command for developers is “draft up.” “This command ships the source code — including the new bits that containerize the app — to Kubernetes, remotely builds the Docker images, and deploys it into a developer sandbox using the Helm chart,” Monroy said. “The developer gets a URL to visit their app and see it live.”

The Docker registry details needed for that will have been set up by the operations team as part of providing Draft to developers.

“Now you go into your IDE, whatever IDE that is, make a change and save your code. The minute that save happens, Draft detects it and redeploys up to the Kubernetes cluster and it’s available in seconds,” Monroy said. Those commands could easily be integrated into an IDE directly, but either way, it’s a much smaller change to a developer workflow than targeting Docker or Kubernetes directly.
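Put together, the developer-facing surface is just those two commands plus an ordinary editor. A minimal sketch of the inner loop for a hypothetical Node.js service, assuming the operations team has already pointed Draft at a cluster and registry:

```bash
cd my-node-service   # hypothetical project; no Dockerfile or Helm chart yet

draft create   # detects the language, writes a Dockerfile and Helm chart into the source tree
draft up       # ships the source tree to the cluster, builds the image remotely and
               # deploys it to a sandbox, returning a URL for the running app

# Edit and save code in any IDE; Draft detects the change and redeploys,
# so the sandbox URL reflects the change within seconds.
```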

Build and Test

For a developer, Monroy says this will feel rather like using platform services. “With PaaS, a developer just writes code and that code goes to the cloud. This is almost like a client-side PaaS that writes your deployment and configuration info to the repo. But because Kubernetes can model anything, you could use this to stand up Cassandra or WordPress or something that PaaS systems have a lot of trouble with. Things with stateful components can be written with Draft; it can model volumes as easily as cloud applications.”

Draft is aimed at the “innerloop” of developer workflow, said Monroy; “While developers are writing code but before they commit changes to version control. Once they’re happy with their changes, they can commit those to source control.”

Writing build and deployment configuration to the source tree makes Draft a better fit for the kind of continuous integration and development pipelines that drive DevOps than PaaS, especially when it comes to build testing.

Typically, PaaS systems have not integrated well with continuous integration and deployment pipelines where developers check code into source control and then the continuous integration system pulls it out and builds it, and tests it and stages it and then it gets rolled to production.

“Draft solves this because it drops the configuration into the source control repo and the continuous integration pipeline can pick it up from there,” Monroy said.

There are few other tools aimed at helping fit Docker into the developer workflow the way Draft does. “We see a lot of people use Docker Compose but the problem is that requires Docker on a laptop, which not every organization is willing to roll out across their entire fleet,” he noted. Docker Compose and tools like Bitnami Kompose use the Docker data model; Draft uses the Kubernetes data model, which Monroy called “much richer and much higher fidelity”.

Draft ships the entire source tree to the Kubernetes cluster and builds the containers there, which is how it gets around the need to have Docker on the developer’s system. “If you have a massive repo there could be some latency there,” warned Monroy. If that’s an issue, Draft can work equally well with a Kubernetes cluster on a laptop though, and for some organizations, it will replace even slower processes.

One large company wanting to move its 10,000-plus Java developers to Kubernetes has been using Cloud Foundry and cf push; they’re very keen to use Draft instead.

Developer Productivity

Draft takes one step towards solving an issue Azure Container Service architect Brendan Burns calls “empty orchestrator syndrome; ‘we’re totally deploying Kubernetes, we’ve deployed Kubernetes — now what?’”

“The real problem I think, is that it’s still too hard to build applications. We have a lot of the pieces but we haven’t started to actually show a way things could come together,” Burns said. Draft fits in neatly with developments like service brokers and managing secrets for applications in containers and improving developer productivity with features like remote debugging in the cloud.

Draft is only intended to be one building block in a composable system, though. “We wanted to build a tool that did one thing, one workflow, and did it well,” Monroy told us. “It does help facilitate the pipeline view of the universe, and we have other things in mind for the other parts of the workflow where a CI system picks up code.”

The Cloud Native Computing Foundation is a sponsor of The New Stack.

Feature image: CoreOS’ Alex Polvi and Microsoft’s Gabe Monroy (partially hidden) demonstrate Draft at CoreOS Fest. Photo by Alex Williams.

The post Microsoft Draft Offers Kubernetes Support for Developers appeared first on The New Stack.

Cosmos DB: Microsoft Azure’s All-in-One Distributed Database Service


Microsoft’s recently released Cosmos DB is a globally distributed NoSQL database service that lets you pick and mix your favorite data model and database APIs and still get the consistency offered by a standard transactional database.

Cosmos DB, which debuted at the Microsoft Build user conference earlier this year, is an upgrade of the Azure DocumentDB service, pressing beyond that service’s original roots as a JSON document store.

NoSQL systems often scale well but don’t have a rich query experience; schema-based relational databases have rich query options but don’t scale as well. The combination of data models and APIs in Cosmos DB means you can pick a middle ground, taking away the burden of dealing with schema but not at the expense of queries.

Cosmos DB offers a globally distributed database with elastic scale, petabytes of storage, guaranteed single-digit millisecond latencies, and no need for schema or index management. It can handle multiple data models and data APIs and offers multiple consistency models that give you some entirely new ways to build a distributed system.

“SQL servers were optimized for reads and queries, for the workloads of the last twenty years, but the world has changed,” said Microsoft distinguished engineer and founder of Cosmos DB Dharma Shukla. Internet of Things “devices require a rapid velocity of data; there’s lots of data being generated at a high rate and you need an engine that can sustain large, rapid writes and still serve queries.”

Cloud computing sets the stage for globally distributed apps. Many organizations, however, will distribute the front-end, but leave the back-end database in one location. With Cosmos DB, data is distributed in sync with the app.

Cosmos DB is one of the fastest growing Azure services, although Microsoft can’t name some of the largest customers using it. Microsoft is also a little cagey about how it uses the service itself, but all Microsoft billing, including all the Store transactions, go through it.

Cosmos DB was created for globally distributed mission-critical applications.

Any Schema, Any Model, Any API

Cosmos DB has a write-optimized, latch-free database engine with automatic indexing.  “We can keep the database in sync at all times and we can deploy it worldwide without worrying about schema versions; because it’s schema agnostic, there is no schema version,” Shukla said. With no versioning to take care of, developers can iterate their apps rapidly.

The database engine in Cosmos DB supports multiple data models and APIs. “No data is born relational,” Microsoft Cosmos DB architect Rimma Nehme told us; “it’s born dirty and messy, in whatever shape or structure it’s created.”

DocumentDB already supported JSON docs, key value pairs, columnar and graph data; the DocumentDB APIs include both SQL and JavaScript stored procedures, user defined functions and transactions. Cosmos DB still supports all these as well as the MongoDB APIs, so local development can be done against MongoDB and the code run against Cosmos DB.

Future updates will add support for Cassandra and perhaps Amazon Web Services’ DynamoDB and other database stores, Nehme suggested. “We’re not dogmatic about APIs or data models.”

The same underlying database engine handles all these data models, Shukla told us. “Instead of hosting different database engines [for different models], we created one engine that’s schema agnostic, that’s write optimized, that can support multiple data models — and that’s API neutral.”

The same is true of graph support, which is based on Apache TinkerPop. “The graph layer is very generalized. Gremlin is one query language but we’re also going to add graph operators to DocumentDB SQL, MongoDB has graph operators so we will support those. Graph is very interesting; there are a lot of IoT scenarios. There’s a lot of momentum around graphs inside Microsoft and customers have been asking for it.”

NoSQL isn’t the only possibility; Shukla suggested that ‘significant proportions’ of the full ANSI SQL grammar could also be mapped to the Cosmos DB data model.

Cosmos DB can be used for Internet of Things-driven apps.

Choose Your Consistency

The exciting thing about Cosmos DB is that some of the consistency models it gives you are genuinely new. Strong and eventual consistency, the two standard models other distributed databases offer, sit at the extremes; Cosmos DB adds three consistency models in between: bounded staleness, session and consistent prefix.

Nehme described the options as “a slider that allows Cosmos DB to behave like a relational database or a NoSQL database.”

“Implementing any distributed system involves a trade-off between, on the one hand, the degree of consistency it provides to users, and on the other its availability and response time,” Nehme said. Some customers need consistency enough that they’re willing to pay for that in performance; the other extreme has been NoSQL speed but no consistency guarantees.

Bounded staleness means that reads can lag behind writes, but only by a fixed amount (in seconds or numbers of operations), and write order is guaranteed; that’s a good match for gaming, or for monitoring a sensor where you need to take action if a reading indicates a problem. Session-based consistency guarantees monotonic reads and writes, in sequence, within a session; the latency is better than with bounded staleness but the global ordering guarantee is weaker.

“If you have devices with data that’s cached locally session consistency can go a long way to solve the problem of having unique data at the edge [of the network]; when the device gets online that cache has to be converged,” Shukla explained. This is the most popular model for developers using the service. “The reason session is a sweet spot is because it enables all these scenarios without you having to choose.”

The newest model is consistent prefix, which guarantees you won’t get gaps. “If you’re operating on a record, version by version, you won’t get gaps in those versions. If you see version one of a record, then you’ll see versions two, three and four in that order; you won’t see version four arrive, then version one, then version seven,” explained Shukla. “This is a much stricter guarantee than eventual consistency gives, but you get high availability and low latency.”

Consistent prefix is “very good for building messaging and queuing systems,” Shukla said.

Cosmos DB is not the only distributed database service with variable consistency levels.

Google Spanner recently added bounded and exact staleness, with a maximum staleness of one hour. However, Spanner’s consistency models currently cover only a single region; this will increase to three data centers later in 2017. Azure, in contrast, offers all consistency models across all 38 of its data centers.

Cosmos DB is a ‘ring 0’ service in Azure so as new Azure regions are rolled out, Cosmos DB will always be on the list of available services, according to Microsoft. You can start with one region or many, and you can add and remove the regions you want your data to be available in, without any downtime. You can also use policy to geofence data into specific regions if you’re covered by regulations.

The team used the TLA+ specification language created by Turing Award winner Leslie Lamport to specify the different consistency models.

“Consistency is no longer a theoretical thing; it’s a reality that developers are facing,” Shukla pointed out. If you’ve got customers around the world and you want fast performance, you have to distribute your database and that means handling consistency.

Shukla compares defining these consistency models for distributed data storage to the codified isolation levels that relational database models created; now you know what trade-offs you’re making. And he noted, “you can save a lot of money by choosing any of those three compared to strong consistency.”
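
As a minimal sketch of what choosing a consistency level looks like in code, the snippet below uses the current azure-cosmos Python SDK (which post-dates this article); the account URL, key and resource names are placeholders, and the level requested by a client can be weaker, but not stronger, than the account default.

# Sketch only: azure-cosmos SDK with a placeholder account URL and key.
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    url="https://<your-account>.documents.azure.com:443/",
    credential="<your-key>",
    consistency_level="Session",  # or "BoundedStaleness", "ConsistentPrefix", "Eventual"
)

db = client.create_database_if_not_exists("telemetry")
container = db.create_container_if_not_exists(
    id="readings", partition_key=PartitionKey(path="/deviceId")
)

container.upsert_item({"id": "r1", "deviceId": "sensor-7", "temp": 21.4})
for item in container.query_items(
    "SELECT * FROM c WHERE c.deviceId = 'sensor-7'",
    enable_cross_partition_query=True,
):
    print(item)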

Cosmos DB can be used to add personalization to applications.

Performance and Promises

The Cosmos DB service level agreements (SLAs) put a price on delivering the promised latency, throughput and availability. “Developers want predictable performance for unpredictable needs,” explained Nehme.

The SLA says 1KB reads from Cosmos DB in the same Azure region will take less than 10ms and indexed writes less than 15ms — but that’s the worst case scenario and the median results are actually less than 2ms and 6ms, respectively, according to Nehme, and that’s with data encrypted at rest. And because you can distribute your data into multiple regions, you can get that low latency wherever in the world you need it. The SLA for distributing your database into another Azure region is 30 minutes, but again, the data movement happens faster than that on the service.

Cosmos DB can elastically scale not just storage — which Shukla called a relatively easy problem “because it takes time for a table to grow to a petabyte or for data to be deleted” — but also throughput. You can change the number of transactions per second in your code, or set different regions to have different throughput rates based on the time of day. “Dealing with latency and throughput forces you to create a resource governed stack. It’s a very difficult distributed systems problem.”

Currently, Cosmos DB charges you by the hour. But if you’re provisioning for peak usage the day you launch a new online service or game, you want more granularity than that, so Microsoft is introducing request units per minute. You can choose how many requests per minute you want your database to handle, and rather than dropping requests if they exceed the limit you’ve set, Cosmos DB can borrow from your budget for the hour to handle sudden bursts.
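
As a sketch of what adjusting throughput from code might look like, again using the current azure-cosmos Python SDK and a container object like the one in the earlier snippet, the request-unit numbers below are purely illustrative; the per-minute bursting described above is a service-side feature rather than something this call controls.

# Illustrative only: raise provisioned throughput for a launch window,
# then drop it back, using the SDK's replace_throughput call.
PEAK_RU_PER_SECOND = 10000
OFF_PEAK_RU_PER_SECOND = 1000

def scale_for_launch(container, launching: bool) -> None:
    """Raise throughput ahead of a launch window, drop it back afterwards."""
    target = PEAK_RU_PER_SECOND if launching else OFF_PEAK_RU_PER_SECOND
    container.replace_throughput(target)  # takes effect without downtime
    print("Requested provisioned throughput of", target, "RU/s")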

Alongside those SLAs are manual and automatic failover options — because the way to see if consistency works is to see what happens when the inevitable network failure partitions your database. As you create Cosmos DB regions in the portal and populate them with data, you can set the priorities of the different regions, and if a failover happens it follows that order.

As well as local persistence and replication within and across regions, Cosmos DB takes periodic backups; if you accidentally delete data you can contact Azure support to get it restored (and there’s an API coming to let you restore the data yourself).

Feature image via Pixabay.

The post Cosmos DB: Microsoft Azure’s All-in-One Distributed Database Service appeared first on The New Stack.

Twilio’s Quest to Offer All the APIs for Modern Day Messaging

If APIs are eating the world, as they say, Twilio is taking a bite out of the engagement portion of this API world.

These days, Twilio might be best known as the service that Uber has been using to provide the SMS channel for drivers and riders to communicate with each other. It offers handy APIs that you can call in your code to make phone calls or send text messages without thinking about cellular providers or monthly minutes.
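
To give a sense of how thin that layer is for a developer, here is a minimal sketch using Twilio’s Python helper library; the account SID, auth token, phone numbers and TwiML URL are all placeholders.

# Minimal sketch with the Twilio Python helper library; every value below
# is a placeholder.
from twilio.rest import Client

client = Client("ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "your_auth_token")

# Send a text message.
client.messages.create(
    to="+15558675309",
    from_="+15017122661",
    body="Your driver is arriving now.",
)

# Place a phone call; Twilio fetches TwiML call instructions from the URL.
client.calls.create(
    to="+15558675309",
    from_="+15017122661",
    url="https://example.com/voice-instructions.xml",
)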

But over the last nine years, Twilio has gone from providing a handful of telephony APIs to offering a comprehensive set of communications and engagement services that you can plug into your code, or use together as a platform.

Twilio services fit in a sweet spot for developers between low-level protocols like SIP and WebRTC, platform and provider-specific services like Apple push notifications and Facebook Messenger, and the old-school telephony setups that are the antithesis of agile development and cloud services.

“I’m a developer. At every single company I started, I wanted to build things to communicate with my customers; I wanted the ability to write code to communicate with my customers,” Twilio CEO Jeff Lawson told The New Stack. “But every time I looked at how you’re supposed to do this, you get a bunch of application boxes, a bunch of PBXs, run a bunch of lines from the carrier. [You] plug in the gear and then bring in a professional services army to try and get it into commission and get it to try and do what you want. And every time you want to make a change you have to go back to professional services.”

The slow turnaround was as much of a barrier to innovation as the high costs. Once, Lawson was quoted $2 million for a system to allow customers to call in messages.

“It’s not designed to be a platform you can build on, it’s not designed for developers to realize new ideas. It’s the opposite of the software ethos, that idea that first of all you get something up and running, then you listen to customers and you iterate until you get it right; that’s the superpower of software,” he said. “But with the time it takes to make changes, it would be two years before I started getting feedback from my customers on whether the thing I built was useful to them. That’s crazy!”

API All the Things

Lawson’s view is that everything should be an API and that the APIs Twilio offers have to stick around as long as the services they work with do. “We’ve never killed an API,” he noted. But those APIs have built up into what he called a ‘programmable communications cloud’ on top of the communications network Twilio built, with a wide range of integrations to other services.

“We started with voice and SMS and we started layering on more carriers and we realized we’d built a network we could build APIs on top of, to control the flow of the call and give you a message flow, and insights, and a media engine for smarter content. And then developers built a layer of engagement on top of that,” he said.

Twilio can sell you SIM cards designed for Internet of Things devices that you can configure through Twilio’s cloud services, so when they arrive they’re ready to plug in and start sending back sensor data. That’s a U.S. service now with roaming but it’s coming to Europe and other geographies this summer. It has the Authy authentication service, which powers a new Android user verification service that combines push authentication for signup with checks on whether that user is on a real phone number or a VOIP line that might be used by a scammer.

There are APIs for fax as well as for video conferencing (including screen sharing for a small group, recording video rooms with up to 50 people and detecting who the current speaker is), and for monitoring the quality of voice calls (including whether the network is congested or the user has muted their microphone). Call controls can reroute calls that don’t get through to a shortcode to a full phone number, convert MMS messages to SMS format, or redact PII in messages and handle opt-out processing for compliance.

The range of Twilio APIs for working with voice calls.

The voice APIs add up to what Lawson called “programmable voice,” everything from dialing a phone number to handling a WebRTC connection, to conferencing other people in and recording the call. You can also monitor call quality and carrier connectivity.

Twilio’s media engine can already tell when your call goes to an answering machine (so you could switch from a live agent to having an API speak your own recorded message, correctly left after the beep), and it can do speech recognition (using Google’s voice recognition service, at 2 cents per recognition) and feed the results into a natural language understanding service.

“No matter what the channel is; voice or SMS, chat or Alexa or whatever, wherever customers talk to you Twilio will understand that. You train it once, you write your app once and you run it everywhere your customers may want to talk to you,” Lawson said.

Once the speech is recognized, Twilio’s natural language understanding extracts the intent and the entities mentioned in the sentence – going from a phone call to a booked flight.
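
As a hedged sketch of the speech side of that flow, the TwiML below (generated with Twilio’s Python helper library) asks Twilio to transcribe the caller’s speech and post the transcript to a webhook; the action URL is a placeholder, and the natural language step that extracts intent and entities is not shown.

# Sketch: TwiML that gathers speech on a call and posts the result to your
# webhook. The action URL is a placeholder.
from twilio.twiml.voice_response import VoiceResponse, Gather

response = VoiceResponse()
gather = Gather(input="speech", action="https://example.com/handle-intent", method="POST")
gather.say("Where would you like to fly to, and on what date?")
response.append(gather)

print(str(response))  # return this XML from the webhook that handles the call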

Engagement Tones

Noticing some common patterns for how developers use Twilio APIs led to building what Lawson dubs “declarative” APIs, which make up an “engagement cloud.”

“Most REST APIs are imperative, you’re sending commands. But sometimes you want best practices and common patterns. That’s why we pick development frameworks; because they allow us to focus on the building, not the plumbing,” Lawson said.  “You tell us what you’re trying to accomplish and let us apply common patterns and best practices and do that for you. You get flexibility, but you also get to production faster.”

That approach led to APIs like:

  • Notify: For sending notifications over multiple channels of communications including mobile and web push notification.
  • Sync: For sharing state between devices or applications.
  • Proxy: For putting two users in touch without revealing their private phone numbers.
  • TaskRouter: For building call center flows.

Increasingly, Twilio is going beyond telephony because the way we communicate on our phones isn’t just voice and text; Channels is a new service that works with Notify, Proxy and Twilio’s SMS API to let you send messages to users on a range of services — everything from SMS and Facebook Messenger to Alexa notifications, Twitter DMs, Kik, Viber, WeChat, BlackBerry Messenger, HipChat and Slack, as well as SendGrid for email.

That list is going to get longer as new services arrive. “We built this really cool engine that allows you to do the complex configuration to onboard a new API for a new endpoint in days; when Facebook announces something or when Google announces something, we’ll be able to incorporate these channels really efficiently,” Lawson explained.

It’s not just about connecting a new service, though; Notify also understands what you can do on different services.

“What we’ve done is build out abstractions for key types of engagement; there’s the one-to-zero use case, that’s notifications, there’s one-to-one, many-to-one, one-to-many. So a one-to-one conversation is Twitter direct messages, many-to-many is Slack, but these are interaction models that end up getting replicated on multiple services. The key is having the right abstractions to plug things into,” Lawson said.
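
As a hedged example of that notification abstraction, the sketch below sends one message through the Notify API with the Python helper library; it assumes a Notify service has already been configured in the console with bindings for the channels you care about, and the service SID and identity are placeholders.

# Sketch only: assumes an existing Notify service with channel bindings;
# the SID and identity are placeholders.
from twilio.rest import Client

client = Client("ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "your_auth_token")

client.notify.services("ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX").notifications.create(
    identity=["customer-0042"],      # the identity the bindings were registered under
    body="Your order has shipped.",  # Notify maps this onto each bound channel
)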

You can use all of those directly in code, or through what Lawson referred to as the Twilio “runtime.” Stretching the usual definition somewhat, this covers the portal on the Twilio site where you can write code with a built-in debugger, the new serverless Functions tool that abstracts away the usual infrastructure you’d use to run code, and a visual programming interface that exposes all the other Twilio services. If you’re using the Chat API to build a chat system, Twilio now offers templates with user interface controls for the design.

All this vastly simplifies communications on traditional telephony and across the mix of online tools and services where your users are, even if you only use a single API. Lawson was typically enthusiastic about the potential for developers; “A single developer can command power that entire nations didn’t have access to a few years ago.”

Feature image: Twilio CEO Jeff Lawson demonstrating speech recognition and language understanding on stage at the recent Twilio Signal conference. All images by Mary Branscombe.

The post Twilio’s Quest to Offer All the APIs for Modern Day Messaging appeared first on The New Stack.

FPGAs and the New Era of Cloud-based ‘Hardware Microservices’

In his keynote at the Microsoft Build conference earlier this year, the head of Microsoft’s AI and Research Harry Shum hinted that at some point the Microsoft Azure cloud service will give developers access to field programmable gate arrays (FPGAs). Azure Chief Technology Officer Mark Russinovich also talked about Azure exposing “[FPGAs] as a service for you sometime in the future.”

What is that FPGA-powered future going to look like and how are developers going to use it?

FPGAs aren’t a new technology by any means; traditionally, they have been reserved for specialized applications where the need for custom processing hardware that can be updated as very demanding algorithms evolve outweighs the complexity of programming the hardware.

With processors, Russinovich explained to The New Stack, “the more general purpose you are, generally, the more flexible you are, the more kinds of programs and algorithms you can throw at the compute engine — but you sacrifice efficiency.”

The array of gates that make up an FPGA can be programmed to run a specific algorithm, using the combination of logic gates (usually implemented as lookup tables), arithmetic units, digital signal processors (DSPs) to do multiplication, static RAM for temporarily storing the results of those computations and switching blocks that let you control the connections between the programmable blocks. Some FPGAs are essentially systems-on-a-chip (SoC), with CPUs, PCI Express and DMA connections and Ethernet controllers, turning the programmable array into a custom accelerator for the code running on the CPU.

The combination means that FPGAs can offer massive parallelism targeted only for a specific algorithm, and at much lower power compared to a GPU. And unlike an application-specific integrated circuit (ASIC), they can be reprogrammed when you want to change that algorithm (that’s the field-programmable part).

FPGAs have much more data parallelism than CPUs.

“FPGAs hit that spot, where they can process streams of data very quickly and in parallel,” Russinovich explained. “They’re programmable like GPU or CPU but aimed at this parallel low-latency world for things like inference and Deep Neural Networks; if you need to do online speech recognition, image recognition it’s really important to have that low latency.”

The disadvantage is that the programming and reprogramming is done in complex, low-level hardware definition languages like Verilog. Rob Taylor, CEO of ReconfigureIO — a startup planning to offer hardware acceleration in the cloud by letting developers program FPGAs with Go — told the New Stack that there simply aren’t many hardware engineers who are familiar with these.

Most FPGA development takes place at processor development companies. And the very different programming model, where you’re actually configuring the hardware, is challenging for developers used to higher level languages.

“As a software engineer, you can start writing simple hardware but writing capable hardware takes several years of learning to get to right,” Taylor said. In rare cases, it’s possible to program an FPGA in a way that permanently damages it, although the toolchain that programs the hardware should provide warnings.

This is one of the reasons FPGAs have never become mainstream, Taylor suggested. “It’s the cost of doing FPGA engineering. If you can only hire a few expensive engineers, there’s only so much you can do. You end up with very vertical specific solutions and you don’t get the bubbling innovation that, say, the cloud has brought.”

Nonetheless, Taylor sees FPGAs as a good solution for a range of problems. “Anything where you have data in movement and you’re processing that and getting an answer and responding to it or sharing that answer somewhere else. You could build an in-memory database on FPGA to do statistical analysis blazingly fast without going near the CPU.” Such applications could include image and video processing, real-time data analytics, ad technologies, audio, telecoms and even software-defined networking (SDN), which he noted is “still a massive drain on resources.”

The ReconfigureIO approach uses Go Channels, which Taylor said fit the model of FPGA pipes, “but we’re working on an intermediate layer, which we want to have be standard and open source that will let people use whatever random language they want.”

The complexity of programming them is why the Amazon Web Services FPGA EC2 F1 instances that let you program Xilinx FPGAs are targeted at customers who already use FPGA appliances for their vertical workloads in genomics, analytics, cryptography or financial services and want to bring those workloads to the cloud. AWS actually provides a hardware development kit for FPGA configurations. Some of those appliance makers like Ryft will be providing APIs to integrate the AWS FPGA instances with their analytics platforms the way their FPGA appliances already do.


FPGA vendors are starting to offer higher level programming options, like C, C++ and OpenCL. AWS is relying on OpenCL FPGA programming to reach more developers in future, although this still requires a lot of expertise and isn’t necessarily a good match for the FPGA programming model.
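
To illustrate the style of programming OpenCL brings, here is a small host-plus-kernel sketch using the pyopencl library; it runs on whatever OpenCL device is available, and targeting a real FPGA would require the vendor toolchain to compile the kernel to a bitstream, so treat this as a model of the workflow rather than as FPGA code.

# Sketch of the OpenCL model: the host sets up buffers and a command queue,
# and the kernel (written in OpenCL C) runs across many work items.
import numpy as np
import pyopencl as cl

a = np.random.rand(1024).astype(np.float32)
b = np.random.rand(1024).astype(np.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

program = cl.Program(ctx, """
__kernel void vadd(__global const float *a, __global const float *b, __global float *out) {
    int gid = get_global_id(0);
    out[gid] = a[gid] + b[gid];
}
""").build()

program.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)
result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)
print(result[:4])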

“It’s still a very esoteric type of development environment,” Russinovich noted; “but I think the trend is clear that things are going to get more and more accessible. I think you can imagine at some point — I’m talking a far future vision here — developers using different languages to write programs with tools that will take a look at your algorithm and determine, based on profiling or analysis, that this piece of your program is more efficient if we run it on FPGA and this one on GPU and this one on CPU, and developers just take advantage of the best capabilities the platform has to offer.”

Smart Networks

Microsoft is taking a rather different approach. On Azure, you can actually use FPGA-powered services already; you just don’t know that you’re using FPGAs — in the same way that you don’t know you’re using flash SSDs when you use Cosmos DB or GPUs when you use Microsoft Cognitive Services. In fact, the whole Azure network relies on FPGA-powered software-defined networking.

When Microsoft first started putting FPGAs into Azure, it was to scale low latency and high throughput to systems with very large amounts of data and very high traffic; the indexing servers for Bing. Initially, those FPGAs had their own network connections, but to simplify the network topology Microsoft switched to connecting them to the same NIC as the server they were in. Once the FPGAs were connected directly to those network cards, they could also accelerate the software-defined networking that Azure uses for routing and load balancing.

The impact of FPGAs on query latency for Bing; even at double the query load FPGA-accelerated ranking has lower latency than software-powered ranking at any load.

Like custom silicon designed to go on a network card, these FPGA SmartNICs are more efficient than CPUs and use less power. But as Microsoft improves that software-defined networking stack to work with the 50GB and 100GB network adaptors that are coming soon, the FPGAs can be reprogrammed — which you couldn’t do with custom silicon.

These SmartNICs already implement the flow tables that are the basis of Azure’s software-defined networking; in future, they might also implement Quality of Service or RDMA, and speed up storage by offloading cryptographic calculations and error checking.

Azure Accelerated Networking has been available on the larger Azure VM sizes since last year, for both Windows Server and Ubuntu, although the service is still in preview and has what Russinovich called “extremely rare compatibility issues,” so you have to choose to use it. It also has some limitations, like needing separate Azure subscriptions if you want to use it for both Windows Server and Linux. The bandwidth between two VMs inside Azure, even with a 40-gigabit network adapter on each VM, is only around 4Gbps; with FPGA-accelerated networking, that goes up to 25Gbps, with five to ten times less latency (depending on your application).

The impact of FPGA-accelerated SDN (credit Microsoft).

The next step is building services for developers to use those FPGAs, even if it’s indirect. “There are multiple ways to make FPGAs available to developers, including us, just using them for infrastructure enhancements that accrue to every developer that uses our cloud, like SDN,” Russinovich explained. “We want to make deep neural network [DNN] and inference models available to developers, that are easy to deploy and easy to consume, and that’s running DNN on top of FPGA so they get the best performance. They would do their training for example on GPU, and bring us the models. The developers aren’t aware it’s FPGAs underneath; they just hand the DNN model to the platform and the platform executes it in the most efficient way possible.”

Different ways developers will use FPGAs in Azure (credit Microsoft).

Russinovich demonstrated the advantage of that at Build, with what he called “tens to hundreds of tera-operations, so you can get really effective inference.” Running the same machine learning algorithm on 24 FPGAs rather than 24 CPUs, he showed a 150-200x improvement in latency and around 50 times the energy efficiency.

Developers can already take advantage of this through the Microsoft Cognitive Services APIs. “We’ve already got this in production in Bing as part of the next level of acceleration for Cognitive Services training, as well as for Bing index ranking.”

Hardware Microservices

Although each FPGA deployed in Azure sits on the same motherboard as a CPU and is connected to it as a hardware accelerator, the FPGAs are also directly connected to the Azure network, so they can connect to other FPGAs with very low latency, rather than being slowed down by piping the data through the CPUs.

That gives you much better utilization of the FPGAs, and the flexibility to still use them for acceleration as part of a distributed application that also runs on CPUs, or for experimenting with algorithms for acceleration that you’re still developing. “If you’re not sure of the optimal algorithms for say compression or encryption for the data you’re processing, or the data shape is going to be changing over time so you don’t want to take the risk of burning it to the silicon, you can experiment and be agile on FPGAs,” Russinovich told us.

A management fabric co-ordinates those directly connected FPGAs into applications, so different layers of a DNN for a model that’s been pre-trained with TensorFlow or Microsoft Cognitive Toolkit (CNTK) could be on different FPGAs — giving you a way of distributing a very deep network across many devices, which avoids the scaling problems of many DNN frameworks.

Distributing a DNN across the Azure FPGA hardware microservices fabric (credit Microsoft).

“This is a vastly more general usage for FPGAs, where we think there is potential for lots of innovation, that we call hardware microservices,” Russinovich told us. “If you have a large fleet of FPGAs and they’re directly connected to the network and programmable through the network, then what kinds of apps can you build that are accelerated in ways we can’t achieve on standard kinds of hardware that we’ve got today? We’re using that infrastructure first for our DNNs, but we see that becoming a general-purpose platform.”

He talked about that fabric having web search ranking, deep neural networks, SQL accelerations and SDN offload programmed into it. “Azure Data Lake Analytics is looking at pushing machine learning also into FPGAs.”

Will developers end up writing their own applications to run on that reconfigurable hardware microservices compute layer, and will they use FPGAs broadly at all? Russinovich predicted a mix of ways developers will end up using them.

“There will be developers that will directly take advantage of these things but I think many developers will end up indirectly taking advantage of this by leveraging libraries and frameworks that include those things for them, or using microservices models provided by ISVs or the open source community.” Further down the line, he suggested that could work much the same way containers do.

“Today in the developer space if I want a REST front end, I just pull a node.js Docker container. I don’t have to write it myself, I’m just using it. I think you’ll see the same model, where you’ll say I want this algorithm, I want the most efficient possible deployment of it and I’ll be deploying it on FPGAs even though I’m not directly writing the code that goes onto FPGAs — and maybe I’m even getting it from a Docker repository!”

FPGAs make sense for cloud providers that have the expertise to work with them, but they might also show up wherever you’re collecting a lot of data. “I definitely think there’s a place for FPGAs on the edge, because you’re going to have a lot of inference happening on the edge. Instead of sending data up into the cloud you do processing right there and you can do incremental training on top of the FPGAs, as well as having the models evolve with the data that’s being consumed on the edge.”

That’s all some way off, Russinovich noted, but just as GPUs became a standard development tool for certain problems as Moore’s Law slowed down for CPUs, so might FPGAs — whether developers know they’re using them or not.

“We’re at the early stages of opening this up and making it accessible not just to vendors like Xilinx and Altera, but to startups who are looking at higher level programming languages for FPGAs. I think we’re at the first wave of this new generation that’s kind of a rebirth of this technology, which seems to come and go — every five to ten years it gets hot and then fades away, but I think it’s here to stay this time.”

Feature image via Pixabay.

The post FPGAs and the New Era of Cloud-based ‘Hardware Microservices’ appeared first on The New Stack.

Twilio Previews a Serverless Capability, Called Functions, to Manage Messaging Apps

Twilio has launched a preview service, called Functions, that lets developers write and run serverless code within the Twilio Runtime console, giving them more control over how to manage their Twilio API-driven messaging applications. This pre-configured environment has helper libraries, API keys, asset storage and debugging tools, which can be accessed inside the Twilio web portal.

“The primary reason we built Functions was to improve the speed of development, to make it so you don’t need to think about ‘how do I scale web infrastructure?’” Twilio messaging general manager Patrick Malatack told The New Stack. “Now you don’t need to think about scaling, you just deploy it to Twilio. We abstract away all the OS, all the hardware, and all the infrastructure and let you focus exclusively on your code.”

Twilio Functions isn’t intended to compete with general use serverless environments, such as Azure Functions or AWS Lambda. In fact, it runs on Lambda. Rather, it can help developers better scale and manage their Twilio-based messaging apps.

Twilio has had a web-based GUI for letting non-developers create small voice and SMS applications (called Twimlets) for some time. You can create TwiML (Twilio Markup Language) apps and TwiML Bins to store that code that handles incoming call events, in the Twilio portal.

Usually, the first step in building an app using Twilio APIs is standing up a public web server to receive the webhooks Twilio uses to connect to your software, and to manage the authentication and access tokens to do that securely. That’s one more thing to troubleshoot when you’re prototyping an app, and when you put an app into production you need to make sure that the server scales to cope with the demand.

Functions does away with the need for that server, removing one more thing that can go wrong. It lets developers use templates with pre-written code and configuration or write Node.js code. There are a handful of templates to handle common patterns like call forwarding and setting up a conference or creating an access token for Twilio’s Sync service, with more in development.

 

Creating a new Twilio function from a template or with your own node.js code.

“I use Sync for my state and Functions for all my compute and I can build any communications experience on Twilio,” Malatack told us.

Start with one of the templates and the code editing window is pre-populated with code, including the credentials and environment variables passed in from the function configuration (stored as key-value pairs), the event information passed in from the Twilio API you’re working with (using GET and POST) and the callback method you call to return from the function (which can be TwiML, JavaScript, a string — or a string that you use to indicate an error).

You can use the Twilio Node helper library to generate TwiML for handling voice and messaging events programmatically, and you get the same debugger support that’s been in the Twilio runtime for a couple of years. “All the webhooks that Twilio apps are generating; sometimes they hit a server that’s down, so you need to be able to debug that. Functions has that debugger support out of the gate, so you can see what happened to my code, what went wrong?” Malatack said.

Functions also auto scales. A user gets 10,000 free function invocations a month and they cost $0.0001 per invocation beyond that — volume pricing will be available once Functions is out of preview. You can also store arbitrary content for your app in the Twilio Assets service.

Functions covers all the Twilio APIs, and developers can use the built-in got module to handle third-party REST APIs. That’s not the focus for Functions, Malatack explained, so the experience is fairly basic. “Our focus is on making it the best place to build for Twilio. But if you want to have, say, a Stripe webhook hit a Twilio function, technically that works and there’s no reason for us not to support that.”

You can hook your function up to Twilio phone numbers in the Twilio web portal, or you can use the URL shown in the console to call your function. There’s a handy Copy button that lets you test your function in a web page.
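
Because every Function gets an HTTPS URL, you can exercise a deployed Function from any HTTP client while you prototype; the sketch below uses Python’s requests library, and the URL and parameters are placeholders for whatever your Function expects (the Function itself is still written in Node.js in the console).

# Sketch: call a deployed Twilio Function over its URL; the URL and the
# parameters are placeholders.
import requests

resp = requests.post(
    "https://your-subdomain.twil.io/forward-call",
    data={"To": "+15558675309"},
)
print(resp.status_code)
print(resp.text)  # often TwiML or JSON, depending on what the Function returns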

Functions code in the Twilio consoles.

Currently, you’re working with Twilio Functions in the browser rather than an IDE; “We’re thinking a lot about how do we integrate with your existing toolchain,” Malatack told us, so as Functions matures we’re expecting to see ways to build, test and deploy Functions from other developer tools.

Removing the Distractions

The serverless programming model is a logical next step from webhooks because you get the event-based programming without the overhead of hosting a web service, and Twilio isn’t the only one picking up this idea.

The new MongoDB Stitch service for MongoDB Atlas is labeled as a Backend as a Service, but you get pipelines for handling authentication, data access control and integration with communications, messaging and payment APIs (including Twilio’s) without spending as much time on coding and infrastructure for those tasks.

If you’re writing a software-as-a-service app that you’d like customers to be able to extend themselves, Auth0’s Extend service gives you an embedded React code editor that lets customers create serverless Node.js extensions. Similarly, Zapier’s Developer CLI tool lets you write, test and deploy Node.js apps as part of a Zapier integration flow if the inputs and outputs of the APIs you’re integrating need a certain amount of massaging or tweaking; again, that uses AWS Lambda to run your custom code.

Twilio Functions shows serverless is as much a programming style as a service, and you can expect to see the option of running functions without caring about the infrastructure showing up as an option in more places.

Every Twilio function has a fully qualified URL you can use to invoke it.

“Think about all the time and effort that is spent on things that are not your code and the experience you want,” Malatack pointed out. “Part of the reason why we see the amazing user experience we now have is that the abstractions keep getting better and better, so you can spend more of your time around ‘What’s the right outcome I want’ rather than all the things I need to do on the way to get there.”

Feature image by Shwetha Shankar via Unsplash.

The post Twilio Previews a Serverless Capability, Called Functions, to Manage Messaging Apps appeared first on The New Stack.

Microsoft Trims Nano Server for Container Work

When Microsoft introduced Nano Server in Windows Server 2016, as a small-footprint server OS with no local logon, it was targeted at a few different workloads. The company has found the software is increasingly being used for one type of workload: managing containers.

As a result, the company has whittled down the core of Nano Server to just those components needed to run containers. It has also added in networking support to accommodate the needs of container orchestrators, notably Kubernetes and Swarm.

For Microsoft, Nano Server is the heart of Windows Server; a minimal refactoring that future versions of Windows Server will build on, with “just enough” of an operating system to run applications without introducing more of a risk surface than necessary. For customers, it was designed to run both containers and some key infrastructure roles.

“When we launched Nano Server last fall, it had a very small footprint for a compute cluster, a storage cluster and other infrastructure roles, but people weren’t using it for that, but they were using it for containers,” Chris Van Wesep, Microsoft Enterprise Cloud product marketing director told The New Stack. The anonymized telemetry in Windows Server 2016 showed that “the vast, vast majority of Nano Server deployments were for container scenarios not for the infrastructure scenarios.”

Nano Server will include more platform support for container orchestrators like Kubernetes and Swarm.

That’s why the next version of Windows Server, which will be available in the latter part of this year, removes those unused infrastructure roles from Nano Server to make a smaller image still. A smaller image means higher density, faster startup times and the ability to deploy enough containers on a laptop to be consistent with the production environment on the server for development and testing.

Those roles will be in Server Core, which is what Azure and Azure Stack run on, and is what you’ll use to host your Nano Server images. Hyper-V containers also need Server Core, not Nano Server.

Smaller, Faster, More Containerized

“The feedback we got on Nano Server was ‘it’s good, but it could be better’,” Van Wesep explained. “Customers are saying ‘You got it down to sub 500MB, but I would like to see it 200MB or sub 100MB smaller.’ There is a tradeoff that needs to be made here and we’re making that tradeoff; Nano Server is an optimized-for-containers runtime image.”

At one point, the Nano Server image was 1GB, uncompressed, Taylor Brown from the Windows Server team told us. “We have a lot of components that really weren’t relevant to containers but they were relevant to physical machines and virtual machines, like the recovery shell. They weren’t optional components; they were just parts of the OS that we couldn’t pull out in a container-native way.”

For the next release, Microsoft is already committed to getting the uncompressed image down to 500MB and Brown called that conservative; “I think we’re actually going to exceed that in the first release and we’re really committed to continuing down that road. The pull size, the compressed size, and the on-disk size will both get much, much smaller.” In the long term, the goal is for a Nano Server image with .NET Core to be the same size as a Linux image with .NET Core.

That means removing a lot of components. “We’re gutting Nano Server,” Brown said. “We’re pulling out WMI, we’re pulling out IIS, we’re pulling out .NET, we’re pulling out PowerShell, we pulled out every driver, we pulled out event loggers, we pulled out scheduled tasks.”

That might well break workloads, so Brown encouraged developers to join the new Windows Server Insider program to try this container-optimized version of Nano Server as soon as possible, so the team can add back any components that turn out to be necessary. “We’ve got to be aggressive; the vision is let’s cut and prune and get the thing as optimized as possible. Let’s not be afraid, let’s move forward fast and make this thing small.”

Components that used to be part of the images will now be optional. “You’ll pull down Nano Server and if you want .NET, you pull down .NET on top of that, if you want PowerShell you pull down PowerShell. If you don’t want .NET, you don’t have to have .NET, if you don’t want PowerShell, you don’t have to have PowerShell. These are all optional layers now and very native to the container experience.”

Windows Server 1709 will add support for mapping SMB volumes into containers (credit Microsoft)

Nano Server will include more platform support for container orchestrators like Kubernetes and Swarm. It already has network overlay support, so you can make native bridge networks across Swarm clusters, but the next release, version 1709 (named for its planned September release date), will add support for mapping named pipes into containers, which means you’ll be able to run container orchestrators as container images — for both Docker and Hyper-V isolation containers.

Nano Server will also let you hot-add network interfaces; that’s been a problem for Kubernetes, which starts a container without network adapters and then adds them to the container. Also useful for Kubernetes is what Brown called “initial support” for sharing network interfaces between containers; that will work with shared kernel containers in Nano Server 1709 but Hyper-V isolation containers in Server Core won’t get it until a later release. Mapping SMB volumes from file servers and named pipes will make it easier to connect storage to containers and have it remain accessible as containers move around.

Adding the Windows Subsystem for Linux to Server Core hosts (though not Nano Server) will also be useful for scripts that handle containers. You can even use Linux containers to build Windows container images, and vice versa, giving you a lot of flexibility to work with the tools you want.

Going GUI

Nano Server has always had a GUI — of sorts. The Azure Server Management Tools gave you the traditional task manager, registry editor, control panel, performance monitor, disk management, user and groups management, device manager, file explorer and even Hyper-V management through a web interface. It worked with Nano Server, Server Core and Server with the full desktop experience, all the way back to Windows Server 2012, but you needed both an on-premises gateway and an Azure account to use it. That service never came out of preview (so it was never supported for production workloads — previews are ‘use at your own risk’) and Microsoft is discontinuing it as of June 30, 2017.

That doesn’t mean Microsoft thinks you don’t need a GUI for troubleshooting. “We got a lot of feedback saying 'we think the functionality is really neat but it would be great if you didn’t force us to go through Azure to get back to our on-premises stuff’,” Van Wesep told us. He promised more information about what will replace SMT at the Ignite conference in September, but there will definitely be options. “We’re dialed into customers saying 'If you’re going to keep pushing GUI-less things then you should give us a nice way to interface with them’.”

Nano Server and Server Core will both be part of the new Semi-annual Channel, getting updates twice a year. You need Software Assurance to get those updates — or to be using Azure or a hosting provider that provides them. Server Core is also available as Long Term Servicing Channel (which is the only way you can get Windows Server with Desktop Experience). That’s because the container model Nano Server is now aimed at is still evolving and new features will arrive regularly.

If you want to run containers and container orchestrators on your own infrastructure rather than handing over the update work to a cloud provider like Azure Container Service, you need to be prepared for regular updates.

Nano Server and (if you want) Server Core will get updated every six months through the Semi-annual Channel (credit Microsoft)

Six-monthly updates should be less work for DevOps teams to adopt than big updates every three years, Brown suggested. “When we have long release cycles, a lot of stuff changes and it takes a long time to validate it and certify it and get it into production. A shorter release cycle should be a lot easier.” In general, he noted that the developer audience is more open to updating Windows Server frequently than admins running infrastructure roles like Active Directory, “as long as you give me new features.”

And Nano Server is proving very popular for containers, he told us. “We’ve got crazy Nano Server adoption for containers; the numbers are shockingly high.”

Feature image via Pixabay.

The post Microsoft Trims Nano Server for Container Work appeared first on The New Stack.


Salesforce’s Einstein Mixes Automated AI with Business-Specific Data Models

With its Einstein service, Salesforce mixes automated AI with custom data models developers can create for dealing with the specific needs of their customers.

Despite the white-haired personification it uses in marketing, Salesforce Einstein isn’t an AI assistant like Alexa or Cortana; instead it’s a set of AI-powered services across the range of Salesforce offerings (Sales Cloud, Commerce Cloud, App Cloud, Analytics Cloud, IoT Cloud, Service Cloud, Marketing Cloud and Community Cloud).

Some of these work automatically in the standard Salesforce tools, like SalesforceIQ Cloud and Sales Insights in Sales Cloud. Turn these on in the admin portal and you can add the Score field to a Salesforce view or use it in a detail page in a Lightning app to see lead scoring that suggests how likely a potential customer is to actually buy something, as well as get reminders to follow up with specific people to stop a deal going cold or see news stories that might affect a sale (like one customer buying out another).

Einstein tools are (or will be) available for a range of different Salesforce cloud services (credit: Salesforce).

For the Marketing Cloud, Einstein provides similar predictive scores to suggest which customers will buy something based on a marketing email and which will unsubscribe when they get it. It also groups potential customers into audience segments who share multiple predicted behaviors and suggests the best time to deliver a marketing email. Service Cloud Einstein suggests the best agent to handle a case.

Commerce Cloud automatically personalizes the products on the page with product recommendation and predictive sort views (and you can customize that with business rules in the admin portal). Machine learning-based spam detection for Salesforce Communities is in private preview, learning from the behavior of human moderators what’s an inappropriate comment. Those scores and insights use the structured data that Salesforce stores in the CRM, for example when a salesperson marks an inquiry as a sales opportunity (as well as email data from Office 365 and Google that you can connect).

Prices and features for Salesforce Einstein AI services (credit Salesforce).

Because such a wide range of businesses use Salesforce, the same data models and algorithms wouldn’t work well across all of them. So the Einstein tools automatically build multiple models, transform the tenant data (which is stored in Apache Spark) and evaluate which models and parameter choices give the most accurate predictions for each new customer — so one Salesforce customer might have data that can be best analyzed with a random forest algorithm and another might get better results with linear regression.

That’s all automated in the Salesforce platform, down to being able to detect what language your customers are talking to you in. Your business needs to be using Salesforce enough to create sufficient data for it to learn from; for predictive lead scoring that means at least 1,000 leads created and 120 opportunities converted to sales over the last six months, at a rate of at least 20 a month. The more information the sales team puts into those records, the more accurate the lead insight is likely to be. In fact, before you can turn on Sales Cloud Einstein, you can run the Einstein Readiness Assessor, which builds and scores the models to see if there’s enough data to generate useful predictions and if the predictions are going to help your business.

Machine Learning for All (According to Their Needs)

“We evaluate customers who are interested in Einstein based on the size and shape of their data and make recommendations based on how useful it will be to them,” Vitaly Gordon, vice president for data science and engineering on Salesforce Einstein, said at the most recent TrailheadDX, the company’s developer conference.

There are some companies where 50 percent of their leads become opportunities and for others, it’s only 0.1 percent; knowing who to target would be very useful for the second kind of company but even a very accurate prediction about lead conversion wouldn’t make much difference to the first business, he pointed out. “Sometimes it’s more about anomaly detection, which needs a different set of algorithms,” Gordon said.

And if you only have ten leads a day, just call them all; “AI is not a ‘one glove fits all’ tool and not every problem needs AI,” Gordon said.

To help customers trust these automated machine learning systems, Salesforce shows why a particular lead has been scored high or low. “We explain why we believe a lead will work; we explain which models are influencing the score and what in the data is signaling that the opportunity will convert,” Gordon said.

Those scores also give you the expected accuracy of the prediction. “We say we think this is the right next step, with say 82 percent or 77 percent accuracy, so it’s like a guidepost,” Gordon said.

Those customer-specific models also get updated automatically as new data comes in, and a percentage of the data is reserved for on-going testing as well as training — so Einstein can track the accuracy of predictions and spot changes to your data that mean a new model is needed. That would show up in predictors used to explain the score, which should also help users to feel comfortable about the predictions. The systems also look for “leaky features”; predictions that are too good to be true, because they’re predicting a combination of events that will never actually happen.

For developers, Salesforce has three machine learning APIs for building custom models using deep learning on unstructured data: for vision, sentiment and intent. The Einstein Object Detection vision API is now generally available; the Intent and Sentiment APIs are in preview.

Training the convolutional neural network behind the Object Detection API can be done by zipping up folders of images for each of the categories you want to recognize (the folder names become the feature categories) and uploading those to Salesforce. You need 200 to 500 images per label, covering a wide range of examples, with a similar number of examples for each label.

That creates an endpoint that you can send new images to, to detect objects, like a pair of shoes or a pair of pants, and it can also classify the image by counting how many pairs of shoes and pants are in the image and what color they are, or recognizing that a shelf is empty and needs to be restocked with products.
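
As a plain Python sketch of the data-preparation step described above, the snippet below packages one folder of images per label into a zip file for upload; the directory layout and file names are assumptions, and the actual upload call to the Einstein Vision API is not shown.

# Sketch: one folder per label, zipped for upload; folder names become the
# labels. Paths and file names are illustrative.
import os
import zipfile

TRAINING_ROOT = "training_images"  # e.g. training_images/shoes/*.jpg, training_images/pants/*.jpg

with zipfile.ZipFile("product_dataset.zip", "w", zipfile.ZIP_DEFLATED) as archive:
    for label in sorted(os.listdir(TRAINING_ROOT)):  # each folder name becomes a label
        label_dir = os.path.join(TRAINING_ROOT, label)
        if not os.path.isdir(label_dir):
            continue
        images = os.listdir(label_dir)
        print(f"{label}: {len(images)} examples (aim for roughly 200-500 per label)")
        for name in images:
            archive.write(os.path.join(label_dir, name), arcname=f"{label}/{name}")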

Image recognition might help make lead scoring more accurate; someone installing solar panels, for example, could use the API to look at the address on Google Maps and see whether the roof type is suitable for panels.

The API can also be used for anomaly detection; if employees are uploading a large number of images, they can use image classification to suggest any that might not be relevant to the task. Image recognition doesn’t yet extract text from images using OCR, but Salesforce is working on this.

Each recognition comes with a certainty percentage. The API automatically reserves some of the training data for testing, and developers can see the test and training accuracy scores and a confusion matrix for the data model in the Einstein API playground. They can determine if they need more data for training — perhaps by labeling false positives and negatives for it to learn from.

Developers can add a button to the interface in the app to let customers or employees mark images that have been recognized incorrectly. As this is a custom model, developers will need to evaluate the accuracy over time themselves; the system won’t warn them if the accuracy scores are falling.

The Intent API tries to extract what a customer wants from the text of their messages. The API can be trained by uploading a CSV file with two columns, one for the phrases customers are likely to use (like “I can’t log in” or “my bike wheel is bent”) and the labels those phrases represent in your process (like “password help” or “customer support”).

For accurate predictions, Salesforce suggests over 100 phrases per label, and this is an asynchronous training phase, so it will take a little time before it’s ready to call the API. Currently, the API only looks at the first 50 words of every message it scores, although that will be extended. It’s a good idea to have a mix of short and long phrases to avoid false correlations from a few examples of longer messages.

The Intent API can be used in conjunction with the Sentiment API to tell when a customer is unhappy, or the Sentiment API can be used on its own to keep an eye on customer communications generally, to see if they’re positive, negative or neutral. Again, sentiment is a pre-trained model, but uploading data labels like product names or specific positive and negative terms makes the sentiment fit your domain better. There’s a limit of 1GB per upload to all the APIs, but you can make multiple uploads.

The Einstein Sentiment API classifies the tone of text — like emails, reviews and social media posts — as positive, neutral or negative (credit Salesforce).

Salesforce is working on several other AI tools. Heroku Enterprise supports the Apache PredictionIO open source framework (with Kafka as a Heroku service for streaming big data to Heroku) and Salesforce is creating a wrapper to make it easier to build your own custom intelligent apps using PredictionIO.

There’s also integration with IBM Watson APIs. Salesforce Chief Product Officer Alex Dayon explained these as being different layers of data.

“Einstein is in the Salesforce platform using Salesforce data. Watson is a separate set of libraries and data sets, like weather prediction data, and we want to make sure that the data sets can be shared and that Watson can trigger Salesforce business processes,” Dayon said. With Watson predictive maintenance, for example, that service could send an alert that a piece of machinery was going to have a problem, and Salesforce Field Service could automatically dispatch a technician.

“We’re trying to make sure that Einstein can tap into data sets coming from other platforms,” said Dayon.

Feature image by Seth Willingham on Unsplash.

The post Salesforce’s Einstein Mixes Automated AI with Business-Specific Data Models appeared first on The New Stack.

Microsoft Prepares SQL Server 2017 for Linux and Containers

The first release candidate of Microsoft’s SQL Server 2017 is available this week, adding a handful of smaller updates to the major new features in this release, which comes a little more than a year after SQL Server 2016. The best-known new feature is support for Linux (RHEL, SUSE Enterprise Linux and Ubuntu), and for containers running on Windows, Linux and macOS; that includes Always On availability groups for high availability integrated with native Linux clustering tools like Pacemaker.

RC1 adds support for Microsoft’s Active Directory authentication, so Windows or Linux clients can connect to SQL Server on Linux using domain credentials, and for Transport Layer Security (TLS) encryption (1.0, 1.1 or 1.2) to encrypt data transmitted from client applications to SQL Server on Linux.

Machine learning is also a focus, program manager Tony Petrossian told the New Stack. SQL Server 2017 can run in-database analytics using R or Python, without needing to extract and transform data to work with it.

“To add support for R, AI and machine learning workloads in SQL Server 2017, we built an extensibility model,” he explained, “so we can execute the R runtime with SQL Server on a fast path of data exchange between the R environment and SQL. That means you can execute R script as part of your code but with that extensibility enabled the additional work to enable Python was pretty small.”

Not only did that prove that the extensibility model is flexible, but it also means “we get out of the way of any argument between data scientists about the supremacy of R versus Python; we’ll enable both.” RC1 also adds native scoring and external library management to R Services on Windows Server.
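The extensibility model shows up to developers as the sp_execute_external_script stored procedure. The sketch below runs an in-database Python script over pyodbc; the DSN, table and column names are made up, and 'external scripts enabled' has to be switched on with sp_configure before this will run.

```python
# Sketch of SQL Server 2017's in-database Python via sp_execute_external_script,
# run here through pyodbc. The DSN and the dbo.Orders table with its Amount column
# are hypothetical, and 'external scripts enabled' must already be turned on.
import pyodbc

conn = pyodbc.connect("DSN=sql2017;Trusted_Connection=yes")  # placeholder DSN

tsql = """
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
# InputDataSet and OutputDataSet are the default data frames SQL Server passes in and out
OutputDataSet = InputDataSet.describe().reset_index()
',
    @input_data_1 = N'SELECT Amount FROM dbo.Orders';
"""

for row in conn.cursor().execute(tsql):
    print(list(row))   # summary statistics computed next to the data, no ETL step
```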

SQL Server 2017 is a good example of the way Microsoft builds features first in Azure and then brings them to its on-premises server products. As well as the same graph data features as Azure SQL Database, SQL Server 2017 gets the Adaptive Query Processing performance improvements developed for the Azure database, which optimize how queries are run (and can have a significant impact on query performance) by monitoring how well previous queries have run.

“This helps us be far more efficient in our use of resources within the execution of parallel queries and concurrent queries,” Petrossian explained. “The optimizer can adjust its behavior based on execution statistics that are coming through, as opposed to just trying to predict what things will be. As a result, customers will be able to run bigger queries and more concurrent queries.”

Initially, adaptive query processing covers three features: batch mode adaptive joins, batch mode memory grant feedback and interleaved execution for multistatement table-valued functions. Now that SQL Server has the infrastructure for adaptive optimization, future releases will extend it throughout the database engine.

Production Ready

RC1 is close to a final version, Petrossian said. “We’re pretty much complete with the work and unless we find some serious bug, this is it.”

As usual, the new release won't be fully supported until it's generally available, but customers can use it in production if they want the new features. Several customers are already doing that, some with formal support from Microsoft to help test the new release as part of the Early Adoption Program. "We also have a couple of customers who just did it on their own and didn't tell us," Petrossian told us, although he noted those were "smaller workloads, dipping their toes in the waters."

Financial analysis company dv01 originally built its reporting and analytics SaaS tools for bonds and loans on Python, Amazon RDS PostgreSQL and the Redshift data warehouse, but ran into performance and scale problems, with some queries taking longer than the 30-second timeout limit. Soon the engineers were spending more time tuning their database queries than building new features.

To get better performance and in-database analytics, they moved to SQL Server 2016, which meant running Windows Server on Azure. Query times went down to one or two seconds, and better data compression cut the storage needed by a factor of two to three; the data is also encrypted in memory and on disk. With their other systems running on Linux and most engineers using Macs, after a couple of months of testing they migrated their 40 production databases to SQL Server 2017 CTP2 running in Docker on Linux.

That’s exactly the kind of scenario that motivated Microsoft to bring SQL Server to Linux, Petrossian explained. “Aside from the obvious reason, that people are using Linux, one of the big motivators for us was that a lot of the container and private cloud technologies are built on the Linux infrastructure and we wanted SQL Server to be part of that modern IT ecosystem, whether that’s in public or private cloud or wherever it happens to be. Now that we have SQL Server running in Docker, you can take SQL Server and deploy it in container services managed by Kubernetes and so on.”
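As a sketch of how small that first step is, this is roughly what launching the SQL Server 2017 Linux image looks like with the Docker SDK for Python; the image tag and SA password are placeholders, and this is an illustration rather than a production recipe.

```python
# Sketch: launching a SQL Server 2017 Linux container with the Docker SDK for Python.
# The image tag and SA password are placeholders; ACCEPT_EULA and SA_PASSWORD are the
# environment variables the SQL Server Linux image expects on first run.
import docker

client = docker.from_env()

sql = client.containers.run(
    "microsoft/mssql-server-linux:2017-latest",   # image name used for the 2017 release
    environment={
        "ACCEPT_EULA": "Y",
        "SA_PASSWORD": "YourStrong!Passw0rd",     # placeholder; use a real secret store
    },
    ports={"1433/tcp": 1433},                     # expose the default SQL Server port
    detach=True,
    name="sql2017-dev",
)

print(sql.status)   # "created"/"running" once the engine starts accepting connections
```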

“For production use, we are saying customers can use the Linux images of SQL Server in Docker containers with some caution. We’re not suggesting people take their 500TB database and use a container for it but there are a lot of smaller workloads that people run in containers,” he said.

Windows support for containerizing SQL Server isn’t quite as advanced in RC1. “We have SQL Server in Windows containers as well; we’re working on that. On the Windows side, we recommend using it for devtest but not production yet; there are a few things where we still need to round the edges off and do some more work.”

Container support will interest even traditional Windows Server customers, Petrossian said. "I think when it comes to convenience and the way developers work, as IT moves forward folks look at their peers and say ‘that seems like a simple way of doing it.’ If you can do it on Linux, why not on Windows?"

He compared cautions about containers to the early days of virtualization. "I remember when people said no one is going to run a database in a virtual machine because of performance and so on. Of course, time has proven that wrong and people run databases in VMs all the time. You hear a lot of the same questions around containers: how is the performance, containers are ephemeral so what happens to the storage? All those things are fixed or being fixed or improving and we think containers will have a similar path to VMs in gaining adoption — but far more accelerated."

And with this new release, SQL Server is no longer left out of this shift in IT infrastructure.

Feature image via Pixabay.

The post Microsoft Prepares SQL Server 2017 for Linux and Containers appeared first on The New Stack.

How Microsoft Deployed Kubernetes to Speed Testing of SQL Server 2017 on Linux


When the Microsoft SQL Server team started working on supporting Linux for SQL Server 2017, their entire test infrastructure was, naturally enough, on Windows Server (using virtual machines deployed on Azure). Instead of simply replicating that environment for Linux, they used Azure Container Service to produce a fully automated test system that packs seven times as many instances into the same number of VMs and runs at least twice as fast.

“We have hundreds of thousands of tests that go along with SQL Server, and we decided the way we would test SQL Server on Linux was to adopt our own story,” SQL program manager Tony Petrossian told the New Stack. “We automated the entire build process and the publishing of the various containers with different versions and flavors. Our entire test infrastructure became containerized and is deployed in ACS. The tests and the build of SQL Server [we’re testing] get containerized and we deploy hundreds of thousands of containers in a Kubernetes-managed cluster. That gives us faster testing, more parallelization, more efficiency, more automation and a simpler process.”

Kubernetes is an open source container orchestration engine, which was originally developed by Google and is now managed by the Cloud Native Computing Foundation.

For the daily build, testing uses 700-800 containers. That used to be the nightly build test, because it took so long to run, but it’s faster now. “Each container runs through a set of permutations for a set of tasks then goes away,” Petrossian explained. “We can now make changes to the code, build, publish the containers, deploy them and run through our daily tests in a few hours. It used to take more than a day to go through. Now we’re doing more testing because the automation is a little easier for managing Kubernetes clusters than for deploying VMs, and we run through it faster so we have more time. If we really need to, we can do two or three daily builds, not just one.”
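Microsoft hasn't published its internal harness, but the general pattern is easy to sketch: fan the test matrix out as Kubernetes Jobs and let the scheduler pack several self-contained test containers onto each VM. The image name, namespace and suite labels below are hypothetical; this is an illustration of the pattern, not Microsoft's tooling.

```python
# Illustrative sketch only (not Microsoft's internal tooling): fanning a test matrix
# out as Kubernetes Jobs, so many self-contained SQL Server test containers can be
# packed onto each VM in the cluster. Image, namespace and suite names are made up.
from kubernetes import client, config

config.load_kube_config()                 # or load_incluster_config() inside the cluster
batch = client.BatchV1Api()

TEST_SUITES = ["optimizer", "storage", "security", "replication"]

for suite in TEST_SUITES:
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name=f"sql-tests-{suite}"),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="sql-test-runner",
                            image="myregistry.example.com/sql-test-runner:daily",  # hypothetical
                            env=[client.V1EnvVar(name="TEST_SUITE", value=suite)],
                        )
                    ],
                )
            )
        ),
    )
    batch.create_namespaced_job(namespace="sql-testing", body=job)
```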

The improved density from using containers is even more useful when it comes to the weekly test suite, which uses 1,500 to 1,600 containers. “We’re doing more tests, faster and it’s costing us less because we’re deploying into ACS with a much higher density of containers per VM. On the Windows side, we used to deploy a VM with a SQL Server instance and it runs through the tests,” Petrossian said.

“We have a seven to one improvement on the Linux side because we deploy seven containers on a VM instead of having seven VMs each running a version of SQL Server and its associated tests. Each container running in ACS is self-contained and we can have many of them running on a single VM.”

Because so much of the SQL Server code is the same on both platforms, the tests aren’t necessarily specific to SQL Server on Linux. A very large percentage of them are completely OS-agnostic: they exercise internal components of SQL Server that have nothing to do with the operating system, which is most of the product, so they don’t care whether they’re running on Windows or Linux.

“SQL Server interacts with the OS around the edges, for things like authentication, I/O and networking. We have tens of thousands of optimizer tests that are just pure SQL code with no system calls being made and those can run anywhere,” Petrossian said.

The Cloud Native Computing Foundation is a sponsor of The New Stack.

Photo by Jan Erik Waider on Unsplash.

The post How Microsoft Deployed Kubernetes to Speed Testing of SQL Server 2017 on Linux appeared first on The New Stack.

New Twilio APIs Can Help Developers with Authentication, Session Management, Data Synchronization


Developers don’t need to be building a voice or messaging tool to find Twilio’s APIs useful. The company, best known for its communications platform, also has a wide variety of APIs that help developers embed more functionality in their apps, including authentication, session management and data synchronization.

Authentication

The new TwilioAuth SDK in the Twilio Console can be used to add push-notification authentication, passwordless logins and in-app transaction approval to apps on iOS and Android.

Using TwilioAuth, you can create rich interfaces like this that help users tell a real authentication confirmation from a phishing attack (credit Twilio)

With recent successful attacks on SMS-based authentication, the ability to handily add an authentication agent to your app is definitely useful.

If a developer is only targeting Android, there’s a new Twilio Verification SDK for Android in developer preview that works with the Google SMS Retriever API; this lets apps registered with Google Play Services use SMS to verify the user’s identity without being given access to all their text messages, only the specific SMS the app has told the API to look for.

Google and Twilio collaborated on this API to make it easier for developers around the world to use it without worrying about geographical numbers. iOS doesn’t allow programmatic access to text messages, so there’s no equivalent for iPhones, but this lets apps use phone numbers instead of emails for verifications that are harder to fake.

The SDK can spot fraudsters using VoIP numbers rather than numbers connected to real devices, and because verification is based on the phone number you don’t have to worry about users making a typo in their email address and never completing the signup process.

Engagement

This Twilio Authentication service is part of what Twilio calls its Engagement Cloud. This is a package of APIs for different ways to communicate with customers. It includes the Notify API to send push notifications, the TaskRouter API to connect customers to the agent in a call center who has the right skills or speaks the right language, and the new Proxy API for connecting a customer to a specific employee without sharing personal phone numbers that ought to stay private.

Lyft uses these services to connect riders and drivers. Morgan Stanley Wealth Management will start using Proxy to let clients text message a broker without sharing real phone numbers.

Using the Twilio Verification SDK for Android to sign up users without typing mistakes or fraudulent applications (credit Twilio)

Another customer is Nordstrom.

“Nordstrom wants the personal shoppers to be high touch and great experience. They want the shoppers to text with customers when they see something that would suit them, but they won’t want to give out the customer’s personal phone number,” Twilio CEO Jeff Lawson explained to the New Stack. Privacy is only part of that; it’s also about session management. “What if my usual personal shopper goes away? You want the next person to seamlessly take over the conversation. With Remind, they want to proxy the communications between students and tutors so it’s safe and secure, and they want to log and report the calls.”

Proxy doesn’t just handle the routing of calls and texts without disclosing the original numbers (managing and load balancing a pool of phone numbers so it can provision temporary numbers in real time to avoid queuing and latency, and using geographically local numbers to keep costs down); it also includes session management and logging. A session is a JSON representation of a conversation between two people, which might span multiple channels; if a text message and a voice call are both about the same order, you want them grouped into the same session and logged together.
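A hedged sketch of what creating such a session might look like with Twilio's Python helper library follows; the service SID, credentials and phone numbers are placeholders, and the exact resource and parameter names should be checked against the Proxy API reference.

```python
# Sketch of a masked conversation with the Twilio Proxy API via the Python helper
# library. The service SID, credentials and phone numbers are placeholders, and the
# resource/parameter names should be verified against Twilio's Proxy reference.
from twilio.rest import Client

client = Client("ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "your_auth_token")
PROXY_SERVICE_SID = "KSXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

# A session is the JSON object that groups the calls and texts for one conversation
session = client.proxy.services(PROXY_SERVICE_SID).sessions.create(
    unique_name="order-1234",
    ttl=7 * 24 * 60 * 60,   # close the session a week after the last interaction
)

# Add the two real people; Proxy hands each one a masked number from its pool
for name, number in [("Customer", "+15005550001"), ("PersonalShopper", "+15005550002")]:
    client.proxy.services(PROXY_SERVICE_SID).sessions(session.sid).participants.create(
        friendly_name=name,
        identifier=number,
    )
```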

Proxy uses Twilio Channels, so a session can include chat through the Twilio Chat API across a range of services, like a Facebook wall message. Developers can set the “time to live” on a session through Twilio Chat; a chat session can be closed when an agent has handled a support call so they can close that support ticket rather than having to stay in the channel, but the interaction between a buyer and seller in a marketplace could take days or weeks (and you can set the time-to-live to zero for a multiyear conversation that just keeps going).

Closing a Twilio Chat session can be handled automatically by checking whether the user is reachable (they’re active on a device or they’ve registered for push notifications), or with timers; if a customer chatting with an agent on a website closes the tab, the agent won’t know they’re gone, but the backend can use a timer to close the chat and clean up the session.

The Chat API uses pre- and post-events to handle messages. Synchronous pre-events can trigger notifications or block a message before it’s processed (to stop a message containing profanity, a credit card number or a phone number, say), while asynchronous post-events, which fire after a message has been processed, are useful for logging or for triggering a chatbot.
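As a rough sketch of that pre-event pattern, the Flask handler below rejects any chat message that looks like it contains a card number; the EventType and Body parameter names, and the convention that a non-2xx response blocks the message, follow Twilio's Chat webhook documentation and should be verified against the current docs.

```python
# Sketch of a synchronous pre-event webhook for Twilio Programmable Chat that blocks
# messages containing something that looks like a payment card number. The Body and
# EventType form parameters and "reject by returning a non-2xx status" behavior
# follow Twilio's Chat webhook conventions; check the current docs for the contract.
import re
from flask import Flask, request

app = Flask(__name__)
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")   # crude 13-16 digit match

@app.route("/chat/pre-event", methods=["POST"])
def pre_event():
    if request.form.get("EventType") == "onMessageSend":
        body = request.form.get("Body", "")
        if CARD_PATTERN.search(body):
            # Rejecting the request stops the message from being delivered
            return ("Message blocked: looks like a card number", 403)
    # An empty 200 means "no changes, carry on"
    return ("", 200)

if __name__ == "__main__":
    app.run(port=5000)
```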

Twilio’s Chat and Proxy APIs take care of delivery to whatever devices users are active on. To manage and sync state between users and devices, Twilio Sync offers an SDK and REST API for Android, iOS and the major browsers that let you store, view and update state on devices, 16KB at a time, using token-authenticated WebSockets, plus bidirectional webhooks to invoke your backend and processing logic.

The 16KB limit doesn’t stop developers from sending more data; they can have multiple 16KB sync objects, such as documents, lists or unordered JSON collections, or keep one sync object that’s a pointer to an S3 bucket. Developers can use Sync to create collaborative or cross-platform apps, with each update synchronized back and forth between devices, or for real-time applications like co-browsing, dashboards, route planning and tracking apps: anything where no state can be lost, because the user can always look back at the state stored in the cloud to decide what information needs to be sent to a device (and devices that are offline store sync data locally and resynchronize once connected). Twilio’s programmable chat is actually built on top of the Sync API, storing the state for all the message and user objects.

Using Twilio Sync to synchronize content in an app across devices (credit Twilio).
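Here's a hedged sketch of storing and updating a piece of shared state through the Sync REST API with the Python helper library; the credentials and Sync service SID are placeholders, and the payload just has to stay under the per-object size limit discussed above.

```python
# Sketch of storing and updating shared state with Twilio Sync from the Python helper
# library. Account credentials and the Sync service SID are placeholders; each document
# payload must stay under the per-object size limit.
from twilio.rest import Client

client = Client("ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "your_auth_token")
SYNC_SERVICE_SID = "ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
sync = client.sync.services(SYNC_SERVICE_SID)

# Create a named document holding the current state of a delivery
sync.documents.create(
    unique_name="delivery-42",
    data={"status": "out_for_delivery", "eta_minutes": 25},
)

# Later, push an update; every device subscribed to this document sees the change
sync.documents("delivery-42").update(
    data={"status": "delivered", "eta_minutes": 0},
)
```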

The Channels model will work well for Twilio’s Speech Recognition and natural language Understand APIs. Initially it’s for voice calls, especially to call centers, letting any developer create speech-driven interactive voice response systems (instead of making users press buttons in the dialer on their phone); it’s built into Twilio’s Gather API, so the options are those that make sense inside a call, and at the moment it only does 60 seconds of voice recognition. You can set timeouts to make sure people have finished speaking before you start the recognition, including pauses, but you have to specify which of the 89 languages and variants the spoken words are in. You can give the recognizer hints to boost the speech model, which handles general rather than specialized vocabulary; you’ll want to do that for names and number formats.
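A minimal sketch of that kind of speech-driven IVR follows, using the Gather verb with speech input from the Twilio Python helper library inside a Flask app; the hints, routes and language are placeholders, and SpeechResult is the parameter Twilio posts back with the recognized text.

```python
# Sketch of a speech-driven IVR menu using <Gather> with speech input, built with the
# Twilio Python helper library in a Flask app. Hints, language and routes are
# placeholders; SpeechResult carries the recognized text back to the action URL.
from flask import Flask, request
from twilio.twiml.voice_response import VoiceResponse, Gather

app = Flask(__name__)

@app.route("/voice", methods=["POST"])
def voice():
    resp = VoiceResponse()
    gather = Gather(
        input="speech",
        language="en-US",
        hints="billing, returns, order status",   # boost domain words in the general model
        speech_timeout="auto",                     # wait for the caller to finish speaking
        action="/handle-speech",
    )
    gather.say("How can we help you today?")
    resp.append(gather)
    return str(resp)

@app.route("/handle-speech", methods=["POST"])
def handle_speech():
    transcript = request.form.get("SpeechResult", "")
    resp = VoiceResponse()
    resp.say(f"You said: {transcript}. Connecting you now.")
    return str(resp)

if __name__ == "__main__":
    app.run(port=5000)
```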

But, Lawson told us, “We’ll have more flexible ways to use it in future. If you build your natural language understanding for IVR and now you want it for text messaging or for an Alexa skill, you can reuse your models. And if Alexa or Apple or Google does something new next week, you can use that with Twilio.” And because these are APIs, you can use the recognized speech to drive other code; emailing a transcript to a customer, passing it into a form or using an intent classifier to pick out verbs, nouns and dates so you can make a booking or process an order.

Feature image by Joshua Earle on Unsplash.

The post New Twilio APIs Can Help Developers with Authentication, Session Management, Data Synchronization appeared first on The New Stack.

New Microsoft Azure Service Eliminates the VM Overhead from Container Launches


Creating a container is fast and simple and can be done from the command line, but creating a VM to host the resulting container still takes time and usually means using a different interface.

Microsoft has addressed this potential bottleneck with the newly launched Azure Container Instance service, which can be used to create individual containers directly from the Bash prompt in the Azure Cloud Shell, or from a locally running Azure CLI. You can also deploy using templates from a public Docker repository, from a private registry or from Azure Container Service.

The company has also authored a connector for those who want to quickly spin up containers using the popular Kubernetes open source container orchestration engine, which is managed by the Cloud Native Computing Foundation.

“For any container deployment today you first have to have a VM that will be used to host the container,” Corey Sanders, Microsoft’s head of product for Azure Compute, explained. “That amount of time to get started and that amount of work to deploy a container has now gone away with ACI. This pushes the infrastructure up a layer, enabling you to work with containers and no longer have to worry about the creation, deletion, patching and scaling of VMs.”

ACI offers per-second billing, and the user can specify the CPU and memory requirements for each container.

Launching a container directly by typing commands into the Azure Cloud Shell (credit Microsoft).
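The same thing can be done programmatically. Below is a hedged sketch using the Azure SDK for Python; the subscription, resource group, region and image are placeholders, and the exact method names vary a little between SDK versions (recent ones use begin_create_or_update).

```python
# Sketch of creating a single container instance with the Azure SDK for Python.
# Subscription, resource group, region and image are placeholders; method names
# vary between SDK versions (recent versions use begin_create_or_update).
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerinstance import ContainerInstanceManagementClient
from azure.mgmt.containerinstance.models import (
    Container, ContainerGroup, ResourceRequests, ResourceRequirements,
)

SUBSCRIPTION_ID = "your-subscription-id"
client = ContainerInstanceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

container = Container(
    name="hello-aci",
    image="nginx:latest",                                     # placeholder public image
    resources=ResourceRequirements(
        requests=ResourceRequests(cpu=1.0, memory_in_gb=1.5)  # billed per second of use
    ),
)

group = ContainerGroup(
    location="westus",
    os_type="Linux",
    containers=[container],
)

client.container_groups.begin_create_or_update(
    "my-resource-group", "hello-aci-group", group
).result()
```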

The ACI service is in public preview from today in three Azure regions (U.S. West, U.S. East and West Europe), though that will expand over time. Initially, it’s for Linux containers, with support for Windows containers coming “quickly, in the next couple of weeks.”

Kubernetes Connector

For those who want to use the open source Kubernetes container orchestration engine to deploy container instances with ACI, Microsoft has released an open source connector, the ACI Connector for Kubernetes, which can be used to build applications that use both VMs and container instances.

“This enables on-demand and nearly instantaneous container compute, orchestrated by Kubernetes, without having VM infrastructure to manage and while still leveraging the portable Kubernetes API. This will allow you to utilize both VMs and container instances simultaneously in the same Kubernetes cluster, giving you the best of both worlds,” the GitHub page for the project asserts.

Microsoft is also joining the Cloud Native Computing Foundation as a platinum member, with Gabe Monroy, lead program manager for containers on Azure and former Chief Technology Officer of Deis (which Microsoft acquired earlier this year), becoming a member of the governing board.

This support makes sense given Microsoft’s ongoing involvement in container development, with contributions to Kubernetes, Helm, containerd and gRPC, and its own open source Kubernetes tools like Draft. Sanders emphasized the importance of containers to Microsoft. “Containers are changing the way developers develop their code, they’re changing the way apps are deployed and the way system administrators manage their environments.”

Cloud Native Computing Foundation is a sponsor of The New Stack.

Feature image via Pixabay.

The post New Microsoft Azure Service Eliminates the VM Overhead from Container Launches appeared first on The New Stack.
