Archive for the ‘Information Technology’ Category

Configure VPN to Azure Network

June 1, 2017

Microsoft Azure still lives in two worlds: the classic portal and the Resource Manager (RM) model. This is similar to AWS, except AWS integrated and retired its old model. The new account I created in Azure has all assets in the RM model. Microsoft is known for its UIs, and the Azure portal is pretty good compared to the AWS Management Console.

Having all my assets in the RM model made some things easier and others slightly painful, such as setting up a VPN. I wanted to enable VPN access and was looking to set up a Point-to-Site (P2S) VPN. In my earlier post on "Securing cloud assets", I mentioned the different mechanisms (P2S, S2S, ExpressRoute, etc.). Here we will see the simplest one, P2S, in detail. All of the Azure documentation points to using PowerShell rather than the portal (UI). The Azure documentation linked below is pretty complete.

https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-howto-point-to-site-rm-ps

I have nothing against PowerShell, but users of Microsoft products are used to a beautiful and elegant UI. I set out to do this using the portal and was successful, except for a couple of cmdlets.

First things first, I followed the instructions in the link below to make sure PowerShell was working and had the right version of the Azure cmdlets.

https://docs.microsoft.com/en-us/powershell/azure/install-azurerm-ps?view=azurermps-4.0.0

All I had to do in PowerShell was log in; everything else, up until generating the root and client certificates, I did from the portal.

Login-AzureRmAccount

To generate the root and client certificates, you need the New-SelfSignedCertificate cmdlet. For example, the root certificate can be created with a command along these lines:

New-SelfSignedCertificate -Type Custom -KeySpec Signature -Subject "CN=P2SRootCert" -KeyExportPolicy Exportable -HashAlgorithm sha256 -KeyLength 2048 -CertStoreLocation "Cert:\CurrentUser\My" -KeyUsageProperty Sign -KeyUsage CertSign

The link below clearly states the process to generate and export the certificates.

https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-certificates-point-to-site#rootcert

For simplicity's sake, I did not have a load balancer or front-end and back-end vnets. I had only one server instance with one vnet, and I created another vnet for the VPN gateway. Once you are done creating the virtual network gateway in the portal, go to the virtual network gateway, click on the Point-to-Site configuration, and add the root certificate (the one I generated using the cmdlet and exported using Certificate Manager): open the .cer file, copy the content between the "-----BEGIN...-----" and "-----END...-----" lines (not including those two lines), and paste it into the portal. If the IP address range is missing, add that as well. Finally, go to the top and download the VPN client package.

Once downloaded, install the package, then go to "Change VPN settings" on your Windows 10 PC/laptop and you will see your new VPN. Click connect and you are all set. One more important item to remember after you connect the VPN: use the private IP of the server to RDP, not the public IP. You can find the private IP of the server in the network interface section of the portal, or you can get it by connecting to the server and running ipconfig.

This is a demo; the more secure way is to use a load balancer, define a new port for RDP, and do port forwarding from the load balancer to the server. The VPN should then be set to connect through the load balancer.

Searching PDF content – AWS CloudSearch

April 3, 2017

In my earlier post, we saw how we can use AWS CloudSearch to make the data in our RDBMS easily searchable. The documentation says CloudSearch supports a variety of other document formats such as PDF, Word, etc., so I tried to figure out how to do the same with the content of PDF files. This information is not readily available in any documentation offered by AWS, or on any forums or blogs.

This was dead on arrival. The first thing I tried was the "upload documents" feature in the domain dashboard in CloudSearch. It uploaded the file successfully, asked me to run indexing, and I did. However, when I searched for words that were in the PDF file, I got nothing. I repeated the steps and reviewed the data it generated before uploading. To my surprise, I found that the "content" field was binary! No wonder it didn't work. I then started my research and noticed two other people had posted the same question to Stack Overflow, but no one had answered. I posted to the AWS forum, but no answer came for two days. [I answered my own query at the end of it – a nice way to score some brownie points :-)]

I started researching competing products – Elasticsearch, Solr and others – and their documentation said the content needs to be Base64 encoded. So I tried converting the PDF to Base64 and uploading it to AWS; no luck, it didn't work either. No matter what I did, it was not working. I re-read the AWS documentation and decided to try the CLI command cs-import-documents, the command-line equivalent of the "upload documents" wizard. You can upload directly to the domain from the CLI using this command; however, my attempt to upload directly gave a fatal error. Then I used the --output option to generate the SDF (Searchable Document Format) file for the PDF I wanted. The tool generated an SDF JSON file. The content was all text and it looked as follows:

[ {
  "type" : "add",
  "id" : "C:_Downloads_100q.pdf",
  "fields" : {
    "content_type" : "application/pdf",
    "resourcename" : "100q.pdf",
    "content" : "The is the content of the pdf",
    "xmptpg_npages" : "11"
  }
} ]

I uploaded this JSON document using the "upload documents" button on the AWS CloudSearch console, then used the "Run Test" interface, did a search, and voila! It found my PDF and displayed the contents of the content field!
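
The same upload can be scripted instead of clicking through the console. Below is a minimal sketch using boto3 against the domain's document endpoint; the endpoint URL and file name are placeholders for illustration, not values from my setup.

# Sketch: upload an SDF (document batch) JSON file to a CloudSearch domain.
# The endpoint URL is a placeholder -- use the "Document Endpoint" shown on
# your domain dashboard.
import boto3

DOC_ENDPOINT = "https://doc-mydomain-xxxxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com"

client = boto3.client("cloudsearchdomain", endpoint_url=DOC_ENDPOINT,
                      region_name="us-east-1")

with open("100q-sdf.json", "rb") as batch:
    response = client.upload_documents(
        documents=batch.read(),
        contentType="application/json",
    )

print(response["status"], "-", response["adds"], "documents added")

If you have changed the index fields, you still need to kick off indexing afterwards, either from the console or via the cloudsearch client's index_documents call.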

Though I got it to work, there are some limitations. We already saw that each document cannot be more than 1 MB in size and a batch cannot be more than 5 MB. So you need to chop your PDFs into multiple smaller files, run them through this CLI utility or one of the content extraction tools such as Apache Tika, create a JSON document, and then upload it (it would be nice if the CLI tool automatically split documents at 1 MB and uploaded them or generated the SDF files). The other limitation is that a search returns the entire content, not the matching paragraph, page number, or any similar additional information. However, based on the document name, you can fetch the PDF and search and display it the way you want. A rough sketch of the extract-and-split approach is below.
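
Until the CLI does the splitting for you, something like the following works as a stopgap. This is a rough sketch, assuming the Python wrapper for Apache Tika (the tika package, which needs a Java runtime) is installed; the file name and the 900 KB chunk size are arbitrary choices for illustration, and a real implementation should split on page or paragraph boundaries.

# Sketch: extract text from a PDF with Apache Tika and split it into
# CloudSearch-sized "add" operations (each well under the 1 MB document limit).
import json
from tika import parser   # pip install tika (runs a local Tika server, needs Java)

PDF_PATH = "100q.pdf"          # placeholder file name
CHUNK_BYTES = 900 * 1024       # stay comfortably under the 1 MB limit

parsed = parser.from_file(PDF_PATH)
text = (parsed.get("content") or "").strip()

# Naive byte-based split; hits stay searchable even if the cut points are ugly.
encoded = text.encode("utf-8")
chunks = [encoded[i:i + CHUNK_BYTES].decode("utf-8", errors="ignore")
          for i in range(0, len(encoded), CHUNK_BYTES)]

batch = [{
    "type": "add",
    "id": "100q_pdf_part_%d" % n,
    "fields": {
        "content_type": "application/pdf",
        "resourcename": "100q.pdf",
        "content": chunk,
    },
} for n, chunk in enumerate(chunks, start=1)]

with open("100q-sdf.json", "w", encoding="utf-8") as out:
    json.dump(batch, out)      # then upload the file as shown above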

Here is an article on how to do this with Elasticsearch: https://www.elastic.co/blog/ingesting-and-exploring-scientific-papers-using-elastic-cloud

I have already signed up for the Elasticsearch service and hope to try this out there. If you have done this on your own, do share your experience and give me a pointer. I want to know your thoughts.

AWS Cloudsearch

March 25, 2017

The cloud is filled with wide-ranging options to store and retrieve data, and so is on-premise. Every cloud provider, from Amazon to Azure to Google, has its own search solution. In addition to these proprietary solutions, there are open-source platforms such as Elasticsearch and Apache Solr.

Here is a wonderful blog comparing the three products. http://harish11g.blogspot.in/2015/07/amazon-cloudsearch-vs-elasticsearch-vs-Apache-Solr-comparison-report.html

In short, all three offer similar features with few differences. I would say there are two big differences between AWS CloudSearch and the other two.

  1. Data import is a batch process in AWS CloudSearch. If you have streaming data or need immediate data updates, go for Elasticsearch or Solr.
  2. If you don't want to worry about infrastructure, backups, and patches, then go with AWS CloudSearch. Out of the box, it is a true cloud product.

Both Elastic.co and AWS offer Elasticsearch as a service, where they have simplified the infrastructure part; Elastic.co, in fact, offers it as a service on the AWS cloud. However, Elasticsearch and Solr are more popular than CloudSearch, so it is easier to find resources online for those two than for AWS CloudSearch.

So I embarked on a journey to take up AWS CloudSearch, and you know what, it is not that difficult (though I went through some gnawing issues and had my share of frustrating moments). To begin with, I took the manual route: I extracted the data out of my RDBMS (SQL Server), uploaded it to CloudSearch, indexed it, and used the rudimentary UI provided by AWS, and I was able to search within an hour. The biggest advantage I see with the AWS CloudSearch data upload is that it takes a CSV file and converts it to JSON by itself. You can write a batch program to upload in chunks of 5 MB (a rough sketch is below). In addition to CSV, it supports multiple other formats such as PDF, Excel, PPTX, DOCX, etc.
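
As a teaser for that entry, here is a minimal sketch of such a batch program using boto3. The CSV columns, field names, and endpoint URLs are placeholders I made up for illustration; error handling and retries are left out.

# Sketch: convert CSV rows into CloudSearch "add" operations and upload them
# in batches that stay under the 5 MB batch limit.
import csv
import json
import boto3

DOC_ENDPOINT = "https://doc-mydomain-xxxxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com"
SEARCH_ENDPOINT = "https://search-mydomain-xxxxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com"
MAX_BATCH_BYTES = 4 * 1024 * 1024          # leave headroom under the 5 MB limit

doc_client = boto3.client("cloudsearchdomain", endpoint_url=DOC_ENDPOINT)

def flush(batch):
    if batch:
        doc_client.upload_documents(documents=json.dumps(batch),
                                    contentType="application/json")

batch, size = [], 2                        # 2 bytes for the surrounding [ ]
with open("products.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        doc = {"type": "add", "id": row["id"],
               "fields": {"title": row["title"], "description": row["description"]}}
        doc_size = len(json.dumps(doc).encode("utf-8")) + 1
        if size + doc_size > MAX_BATCH_BYTES:
            flush(batch)
            batch, size = [], 2
        batch.append(doc)
        size += doc_size
flush(batch)

# Quick sanity check against the search endpoint.
search_client = boto3.client("cloudsearchdomain", endpoint_url=SEARCH_ENDPOINT)
print(search_client.search(query="widget", queryParser="simple", size=5)["hits"]["found"])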

For both Solr and Elasticsearch, you need to provision a Linux server, then install and configure the software as you would with anything you download. Even if you take the service route, you still need to worry about backups, upgrades, applying patches, etc. One big advantage of these two is that you can run them on-premise as well, while AWS CloudSearch is only available on the AWS cloud. Beyond that, Elastic also has the data visualization tool Kibana, and it comes as a suite (ELK – Elasticsearch, Logstash, Kibana). AWS CloudSearch offers only indexing and search, with no visualization; that is a separate product, QuickSight (I haven't looked at it yet, but I plan to).

I will write more about the programmatic approach in my next entry. Please drop me a line if you can't wait and wish to see it in action!

 

Securing cloud assets

January 23, 2017

In this post, I’m going to point out the ways in which you can access your assets (VMs, Storage, DBs) in the cloud securely. This is no different than accessing the assets in your corporate network. The goal is to make you feel comfortable about having your assets in the cloud.

To accomplish anything meaningful, you need at least one server (a VM if you are going IaaS) and maybe storage as well. The basic level of access restriction can be achieved with AWS IAM or Azure AD: you can restrict and control access to your storage and other assets for different users.

The first and foremost activity is to form a virtual network and keep your servers inside it. A network helps you group your assets so that the security actions you take apply equally to all its constituents, rather than being applied to individual assets. Forming a network also enables free flow of access among its constituents (like an economic bloc). This network then needs to be protected from the rest of the world; security groups help you define the rules for allowing and blocking access.

Now that you have created a network and set up rules, you need a way to access it, since this whole network lives outside your corporate network. There are many ways to accomplish this. The most basic and rudimentary approach is to restrict access to this virtual network to specific IP addresses (location 1, location 2, etc.). However, with this method, though access is limited to specific IP addresses, the traffic goes over the internet and is not otherwise secured. A sketch of such a rule is below.
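
On AWS, for example, this kind of IP allow-listing is just an inbound security group rule; Azure's network security groups work the same way. Here is a minimal sketch with boto3, where the security group ID and office IP are placeholders.

# Sketch: allow RDP into the virtual network's security group only from one
# office IP address. Group ID and CIDR are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 3389,                                   # RDP
        "ToPort": 3389,
        "IpRanges": [{"CidrIp": "203.0.113.25/32",          # office / location 1
                      "Description": "Office RDP access"}],
    }],
)

Anything not explicitly allowed stays blocked, but remember that the RDP session itself still travels over the public internet.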

The next level is setting up a VPN. If you are in development mode, or access is limited to a small set of individuals, you can set up a Point-to-Site VPN (P2S in Azure terms). However, if you want your entire corporate network to be able to connect to the virtual network in the cloud, you can set up a Site-to-Site VPN. In both scenarios the traffic is encrypted, but you are still going over the public internet, so your speed, latency, and SLA are all limited by the bandwidth and SLA of your ISP. If you are not happy with that, you can try a dedicated connection: in AWS it is called Direct Connect, and in Azure it is ExpressRoute. This is not the public internet but a dedicated pipe; you can call up providers like AT&T, Level 3, or the cable companies, and they will be happy to provide one. In addition, most of the datacenter providers such as Sungard, Datapipe, and IO offer direct connections between their datacenters and AWS, Azure, and other cloud providers.

You can see how this is no different from working from home and connecting to your corporate network. If you have enabled your employees to work from home, then you are ready for the cloud.

Browser Push Notifications

January 6, 2017

You might have heard about push notifications for apps (Android, Apple, and Windows – for completeness' sake :-)); to bridge the gap, browsers are now adding support for push notifications. Currently Firefox and Google Chrome support them. You might have seen something like this recently when you visited certain websites, for example, this one from economictimes.com:

[Screenshot: a browser push notification opt-in prompt]

It is a pretty cool feature. With email becoming ubiquitous and Gmail tagging emails as "Promotions", you need another way to reach your customers on significant events. I can state a laundry list of scenarios where this will be useful.

Let us say,

  • You booked a flight and checked in, and the gate has changed
  • You booked a ticket for an event, and you want to show parking tips or a parking coupon
  • You have a business website where a crucial piece of information has been posted and the client needs to be notified
  • A customer left items in the shopping cart, and you want to show a promotional or price-change alert

and more. Here is an introductory video on this

https://developers.google.com/web/fundamentals/engage-and-retain/push-notifications/video

Yes, you can email or SMS your clientele, but in a world of diminishing attention, you need to grab as much attention as you can get.

Please drop me a line if you are interested in discussing this or in implementing it for your business.

Are your apps "Cloud Ready"?

December 11, 2016

It was mainframes before the 90s, then came client-server architecture, and the web followed around 2000. Web architecture is somewhat similar to mainframe architecture. I started my career in client-server, moved to the web, and now I'm in the cloud. The technology departments of the corporate world and the IT services firms assisted in the migration from mainframe to client-server to the web. One needs to change one's mindset when designing apps for a specific technology platform; it is easier to understand if you look at each of these technology frameworks as a genre. The current cloud architecture actually resembles client-server architecture to a great extent. These days, apps are feature-rich, with the majority of the processing happening on the server side while rendering logic still runs on the client side (AngularJS websites, apps on your phone, etc.).

There are multiple cloud offerings – IaaS, PaaS – and of course there are many vendors: Amazon, Microsoft, Google, Rackspace, OpenStack, and the list keeps going. You can have a private cloud, a public cloud, or a hybrid cloud. Now everything you need is provided and available as an app – the concept called SaaS, which started in the early 2000s with the advent of websites.

Let's talk about migrating applications that are currently living in your data centers, co-hosted, or on your own premises. The easiest way is to follow the IaaS path, wherein the physical servers are replaced with VMs in the cloud. I say this is easy because if your servers are currently in a data center, you are already connecting to them remotely from your personal device (PC, Mac, Chromebook – whatever); it doesn't matter whether the remote server is in your DC or in an Amazon or Microsoft DC. Apart from that, your applications pretty much don't need to undergo any change. However, this will not let you take advantage of "the cloud" offerings.

The second approach is to build your apps and make them available as SaaS using PaaS. This is where you will reap the benefits of the cloud architecture. Wait, you will hear from everyone: oh, you are going to get locked in with a vendor! That was true if you had asked me about this even three years ago. Now the cloud offerings have matured, and one approach you can follow is to use containers such as Docker.

Let us take a hypothetical app where you receive data from your various vendors; the app needs to load the data, notify the vendors / your IT / the business on failures, and notify customers of changes and that the latest information is available on the website for everyone to access. In the current world (hosted on a physical or virtual server in your DC), such an app can have its own FTP server to pull the data from vendors and store the files in the server's file system; an SMTP service running on the server to send the notification emails; failed files moved to an error folder and error logs written to a log file in the log folder; and, once the data is valid, updates applied to your database, which runs as a separate cluster in your network. If such an app/website needs to be hosted on the cloud, it needs to be made "cloud ready". What does "cloud ready" mean? Some of the salient features of a "cloud ready" app are being machine agnostic, self-aware, and secure. We will see the top ones here.

  1. Machine agnostic – some examples:
    • No file system – store files on AWS S3 / Azure storage / Dropbox (if you really want a file system, try AWS EFS; it is a network file system you can use like a NAS). If you have a server you will have a file system, but you should avoid relying on it, because it causes problems when you scale up and scale down. In Azure, the VM's local temporary drive is just that – temporary – and you can lose all data on it when the VM is moved or resized. See the sketch after this list.
    • No local FTP / SMTP – use other mechanisms or a third party (FTP is a serious security risk); use AWS SES or another provider for email.
  2. Self-aware
    • Apps should be service oriented, with the service endpoints fetched from application config files.
  3. Secure
    • Secure communications – use SSL and encrypt all communications, especially if you are in a regulated industry (e.g., HIPAA). If your app calls other allied service endpoints, pass encrypted data and follow authentication and authorization policies. This is something often discounted when writing apps for your own DC, even though it is a recommended practice.
    • Encrypt data at rest – don't keep PII (Personally Identifiable Information), but if you must, encrypt the stored data. This includes any document / form data that you collect or create.
    • If you let users upload to your S3 bucket, use a proper authorization policy and move the uploads into a secured bucket ASAP.
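
To make points 1 and 3 concrete, here is a minimal sketch of the "no local file system, no local SMTP" idea using boto3: the incoming file goes to S3 with server-side encryption, and the notification goes out through SES instead of a local mail service. The bucket name, key, and email addresses are placeholders, and SES requires the sender (and, in sandbox mode, the recipient) to be verified first.

# Sketch: store an incoming vendor file in S3 (encrypted at rest) and send the
# notification through SES instead of relying on local disk and a local SMTP service.
import boto3

s3 = boto3.client("s3")
ses = boto3.client("ses", region_name="us-east-1")

# Instead of C:\inbound\vendor1\orders.csv -> an S3 object, encrypted at rest.
s3.upload_file(
    "orders.csv",
    "my-app-inbound-bucket",                     # placeholder bucket
    "vendor1/2016-12-11/orders.csv",
    ExtraArgs={"ServerSideEncryption": "AES256"},
)

# Instead of a local SMTP service -> SES (Source must be a verified identity).
ses.send_email(
    Source="alerts@example.com",
    Destination={"ToAddresses": ["it-team@example.com"]},
    Message={
        "Subject": {"Data": "Vendor file received"},
        "Body": {"Text": {"Data": "orders.csv from vendor1 landed in S3."}},
    },
)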

There are more beyond these; my intention is to bring this to your attention, not to write a thesis. You are always welcome to write to me if you want to discuss further.

Getting your apps to the cloud opens up possibilities and puts more tools and technology at your disposal. You can easily start notifying your clients through SMS (Twilio), you can cache data if you are using GCE, and you can use Firebase or AWS caching to synchronize data and provide a seamless experience irrespective of the device form factor. Ultimately, your application will be device agnostic, and your apps can follow your user from phone to PC to TV. A quick SMS example is sketched below.
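
If SMS is the route you pick, the hosted APIs keep it to a few lines. Here is a minimal sketch with the Twilio Python helper library, where the account SID, auth token, and phone numbers are placeholders.

# Sketch: send an SMS notification through Twilio.
from twilio.rest import Client   # pip install twilio

client = Client("ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "your_auth_token")

message = client.messages.create(
    to="+15555550100",            # your user's number
    from_="+15555550123",         # your Twilio number
    body="Your order has shipped - track it at https://example.com/track/123",
)
print(message.sid)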

These thoughts are not just for migration; keep them in mind for the newer applications you are building as well. Feel free to drop me a line if you need help migrating legacy apps to the cloud.

Server-less Architecture

November 26, 2016

Using AngularJS & Firebase

There are a lot of write-ups on the pros and cons of server-less architecture: how it creates vendor stickiness, high TCO, how it is restricted to certain technologies, how it is not a panacea. I'm not going to get into that argument. The one thing I will buy into is that "it is not a panacea"; everything else is debatable. There are many ways of achieving a server-less architecture, and the one we discussed earlier is AWS with Lambda functions. Yes, that is still a valid choice, but IMHO it takes orchestration of many individual AWS components: you need Lambda, API Gateway (to make it REST), Cognito or another auth provider, DynamoDB or another database, and more.

Google has come up with a platform called Firebase. It is simple and concise. What initially started as a JSON database has evolved into a full-fledged platform that encompasses all the AWS components I mentioned above. It can host, it can handle authorization, it can authenticate with third-party providers such as Facebook and Google, it can store data, and if I missed something, you know what? It can do that too 🙂 (you still need to develop the front-end website). The biggest drawback is that working on Firebase makes you the equivalent of an astronaut on the ISS doing a spacewalk! Remember, the ISS moves in low Earth orbit at ~17,000 miles an hour? Well, Google is not that fast, but they constantly update the platform every few weeks if not hours. That is the constant challenge. If you need help and come across a video that is a year old, well – forget it. You need to find something that is at most six months old.
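
To show what that "JSON database" core looks like, here is a minimal sketch that reads and writes the Realtime Database over its REST API from plain Python. The project URL is a placeholder and the database is assumed to be in wide-open test mode; a real app would go through the JavaScript SDK / AngularFire with proper authentication, as described next.

# Sketch: the Firebase Realtime Database over plain REST -- every path is JSON.
import requests

DB_URL = "https://my-demo-project.firebaseio.com"     # placeholder project URL

# Write a record (PUT replaces whatever is at that path).
requests.put(DB_URL + "/events/launch-party.json",
             json={"title": "Launch party", "seats": 120})

# Read it back.
event = requests.get(DB_URL + "/events/launch-party.json").json()
print(event["title"], event["seats"])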

The simplest and fastest way to get up and running in production is to use AngularJS for the front end and Firebase for the back end. For most apps, the free tier of Firebase is good enough. You write your code in AngularJS and minify it using tools like gulp; Google offers a JavaScript SDK/library called AngularFire to interface with Firebase from Angular. You need a few minutes of configuration on the console to enable authentication and authorization; you copy the configuration details (like the DB connection string), and no secret key / password needs to be stored on the client side – all authorization is done on the server side, so you don't need to worry about your secret keys getting exposed. Once you develop the web app in AngularJS and minify it using gulp, you can host the website either on Firebase or on AWS S3 (How to host on S3).

If you are a startup and want to validate an idea and get an MVP out quickly, this is a great platform. You can even continue to run on it until you reach critical mass, and if you hit a bottleneck, you can then afford to build on any platform. The Google platform can scale; where I see a problem is if you want to download your data out of Firebase – that gets tricky. Even if you are a big corporation and need to quickly get a web-based app out, or need a website to manage an upcoming event, or anything that is not going to run for years but that IT says will take years to build, I would say nothing is cheaper and faster than Firebase.

If you have thoughts and want to discuss, I’m always open, drop me a line.

AWS RDS & Azure SQL – Some Updates

October 5, 2016

I'm continuing to explore cloud-based SQL Server offerings (AWS RDS as well as Azure SQL) vs. running SQL Server yourself on cloud VMs. There have been some good developments recently. Round one: Amazon AWS RDS wins against Azure SQL.

In my earlier post (Amazon RDS DMS), I mentioned that these cloud providers should come up with a mechanism to back up / restore to and from .bak files in S3 or Azure Blob storage. I wrote that in April, and AWS made it possible by the end of June while I was busy playing with Azure. Here is the documentation on how to do this.

https://aws.amazon.com/blogs/aws/amazon-rds-for-sql-server-support-for-native-backuprestore-to-amazon-s3/
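
The feature is driven by stored procedures that RDS adds to msdb, so it can be scripted from anywhere you can open a SQL connection. Here is a rough sketch with pyodbc, where the connection string, database name, and S3 ARN are placeholders (the instance's option group must already have the SQLSERVER_BACKUP_RESTORE option, with its IAM role, attached).

# Sketch: kick off a native .bak restore from S3 on RDS SQL Server using the
# stored procedures RDS provides in msdb. All connection details are placeholders.
import pyodbc   # pip install pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=mydb.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com,1433;"
    "DATABASE=master;UID=admin;PWD=********",
    autocommit=True,
)
cur = conn.cursor()

# Start the restore task (it runs asynchronously on the instance).
cur.execute(
    "exec msdb.dbo.rds_restore_database "
    "@restore_db_name = ?, @s3_arn_to_restore_from = ?",
    ("MyAppDb", "arn:aws:s3:::my-backup-bucket/MyAppDb.bak"),
)

# Poll the task status while it runs in the background.
cur.execute("exec msdb.dbo.rds_task_status @db_name = ?", ("MyAppDb",))
for row in cur.fetchall():
    print(row)

The matching msdb.dbo.rds_backup_database procedure takes the database name and an S3 ARN to back up to.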

Azure has come up with something similar but fell short. They introduced a way to restore from a .bak file in Azure Blob storage: if you have your own instance of SQL Server 2014 or above, there is an option to restore from "URL", where you specify the URL of the backup file in Azure Blob storage.

https://msdn.microsoft.com/en-us/library/dn449492.aspx

However this doesn’t apply to Azure SQL.

On other fronts, Azure SQL offers features such as full-text search in complete form. Here is the link to the documentation on enabling and using the full-text feature.

https://azure.microsoft.com/en-us/blog/full-text-search-is-now-available-for-preview-in-azure-sql-database/

The reason I'm providing the links rather than explaining is that the documentation is really good and you should not have any problem following the instructions; it is pretty straightforward. I had trouble with "Restore from URL" because I thought it was the equivalent of the AWS RDS feature – I was trying to restore an Azure SQL database from Blob storage and getting frustrated!

My biggest pet peeve with Azure SQL is that the majority of the features cannot be managed using SSMS, the wonderful tool that makes SQL Server stand apart. You can't right-click to manage full-text, indexes, keys, etc. I'm happy that Microsoft is following the open-source crowd in warming up to the community, but it should do all this without giving up its USP – the wonderful tools. Without tools such as Visual Studio and SSMS, you might as well use the open-source alternatives.

Please do drop me a line if you have trouble accomplishing any of this or want to share your thoughts.

To Upgrade (or not?)

August 16, 2016

We see all these attractive things in life. We want this, we want that; do we really need it? We don't ask that question. That is the ONE question, and the OTHER question is: what happens if we don't get it? The answers to these two questions pretty much determine your next course of action.

However, oftentimes it is not as simple as that. If you have a 10-year-old Toyota Camry and you want a Mercedes 300, that's one thing; but what if you have a Windows XP PC (yeah, you read that right) that works just fine at work or at home, and you need to upgrade to Windows 7 or 10 (please don't think of 8)? If it is at home, at least you are the only one exposed, but if it is at work, your exposure is a whole lot bigger – not just financially but also in the number of people it will impact. I'm going to focus on running unsupported software here.

I come across this scenario a lot, especially recently. The world has realized that the 800-pound gorilla has woken up (yes, I'm talking about Microsoft). They have started churning out software that is in line with the current global market and that addresses current global risks. In thinking through the process, there is no golden rule or magic wand to figure out whether it is time to bite the bullet and upgrade. However, we can come up with some guiding principles, or a set of questions, to help in the decision-making process. Based on experience, interactions with our clients, feedback, and industry pundits, I came up with the following guiding principles. (Again, this is just a sample.)

  • Are you in a regulated industry? Do your systems conform to the government regulations?
  • Do you have a corporate governance committee? Do your systems conform to the governance rules it has laid out?
  • Is there an industry standards body that you are part of? Do your systems conform to its standards (basically, are you walking the talk and not just preaching)?
  • When did you last perform an upgrade? How many versions are you behind?
  • Do any (and how many) of your systems use software that is no longer supported by the respective vendors?
  • Are all customer-facing systems running supported software?
  • What is the cost of not performing an upgrade? What is your plan when sh*** hits the fan or things stop working?

If you are in a regulated industry, you have no wiggle room: you need to follow the government regulations. You need to be honest with yourself, so you should be conforming to your own standards – these are rules you laid down for yourself and should follow (remember those New Year's resolutions about going to the gym? Well, this doesn't belong in that category). If you are part of an industry standards body, you need to stick to its standards; if you don't, then who will? Then comes the question of when you last performed an upgrade. I get it – it is a chore, there is too much impact, etc. But if your answer is half a decade or more, you have reached a point where the cost of inaction is greater than the cost of action. On how many systems are running unsupported software: if the answer is more than ZERO and they are used in a customer-facing environment, you probably don't have "a plan" for the last question. If they are internal systems, you can at least set up extra security rules around the un-maintained servers – you've got a parachute. The last question, however, is the most critical of the lot for your business. Running unsupported software in customer-facing production is not just jumping out of the plane; it is jumping out of the plane with no parachutes for anyone in your organization. Depending on how critical the system is, it can make jobs obsolete, then the employees, and eventually your own organization. Remember the incidents we read about in the newspapers and other media? Do you want to be one such example?

IMHO, the ideal upgrade cycle is three years and not more than FIVE. Beyond that, you get into the problems of unsupported software, security, obsolete technology, a rusty workforce, and more.

What are your guiding principles? Drop me a line.

Microsoft Open Source

August 8, 2016

Yes, you read it correctly. I hear you saying, isn't that an oxymoron? Well, it is not. The world is abuzz that ever since Satya took over Microsoft, things have started changing: it is more developer friendly, it is more open to new ideas, it is collaborating with various vendors and partners, yada yada yada…

This new era started with Microsoft embracing Linux, letting you create Linux instances in Azure. You all know that Visual Studio is now free to the entire world. Having been associated with these technologies for the past couple of decades, I can say Visual Studio is one of the best tools for developers, hands down (no offense to Eclipse, Sublime, and the like). Microsoft used to offer limited versions before, calling them the developer edition, the community edition, the trial edition. Those editions were either teasers – i.e., the features you wanted were not available and you needed to buy the paid version – or were available only for 90 days or some other limited period. They have moved past that mindset and said, well, there is going to be one edition and it is free.

The best development tool in the world – the one that is very well integrated with the Azure cloud and lets you create, manage, and deploy assets – is now free. If you thought that wasn't enough, Microsoft recently announced that they are giving away the SQL Server Developer edition for free. This is huge! I remember struggling with the so-called SQL Server Developer edition a couple of years back because it didn't have Analysis Services, Integration Services, SSRS, or SQL Agent. Now all of that is available. So what is the catch? Nothing. This edition is exactly the same as SQL Server Enterprise Edition. The download link is here; you just need to sign up / create an account with Visual Studio Dev Essentials. The only limitation is that Microsoft says you cannot use the Developer edition in any production environment. At least you don't have to worry about purchasing anything while you are developing your product; focus on building it, and worry about licensing when it comes time to deploy or go live. Even then, both Microsoft and Amazon offer various "start-up" initiatives that you can benefit from.

I hear you asking: so what brought about this change? IMHO, this is how Microsoft operated from its inception; they probably forgot for a while when Steve was at the helm. It was Novell NetWare (you guys remember?) that pioneered local networks in the corporate world, and then came Windows NT. You all remember what happened to Netscape? Well, Microsoft offered Internet Explorer for free. We didn't get anything free for a while, and now it has started again: you get Visual Studio, the SQL Server Developer edition, Xamarin for mobile apps, the start-up initiatives, Linux.

Don't over-analyze. Start working on your next idea; the ecosystem is available for you with the best tools out there. What are you waiting for? Microsoft is open to sourcing it from / to you. Happy developing.

PS: The complete licensing guide for SQL Server 2014. If you are in doubt, don't hesitate to drop me a line.