SRK-TechBlog

Technologies Blog

Generative AI - Three Levels of Expertise


 

There are three levels of understanding of Generative AI:

Level 0 - Knows the term "Generative AI" but does not understand whether it is yet another cosmetic term or whether it is for real.

Level 1 - Knows the business impact and possible business use cases. GPT (Generative Pre-trained Transformer) is the biggest innovation the NLP domain has seen in decades: machines can now do some form of reasoning using LLMs (large language models), which are trained over massive amounts of data at the cost of expensive infrastructure (GPUs) and time. This reasoning can be combined with RAG and fine-tuning methods to bring in a custom knowledge base as well, so the technology can be tailored to your own knowledge domain.

Level 2 - Knows, at the research level, how the Transformer inside GPT produces its reasoning, what hallucinations are, what the limitations of this technology are, which areas are being explored, and what the constraints of current models are.

By all means GenAI is a game changer; any task you do can be boosted with GenAI:

  • CODING: boost your productivity on routine reasoning tasks. Ask GenAI to generate a for loop in Python 10 levels deep, where each level uses its iteration depth as the variable name and the value at the 7th depth is printed, and watch it appear for you, saving time and some of your thinking capacity as well (see the sketch just after this list). 
  • PRESENTATION / CONTENT WRITING: ask GenAI to rephrase a document (a better copy-paster), or ask it to generate the entire document. 
  • GRAPHICAL: generate images using GenAI or modify existing ones (you can expect soon to alleviate common …
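For example, a prompt like the coding one above might come back with something along these lines (an illustrative sketch only; the small ranges are my own choice so it runs instantly):

```python
# Illustrative only: the kind of boilerplate GenAI can generate on request.
# Ten nested loops, each named after its depth, printing the 7th level's value.
for depth1 in range(2):
    for depth2 in range(2):
        for depth3 in range(2):
            for depth4 in range(2):
                for depth5 in range(2):
                    for depth6 in range(2):
                        for depth7 in range(2):
                            for depth8 in range(2):
                                for depth9 in range(2):
                                    for depth10 in range(2):
                                        print(depth7)
```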

Where do you stand in your GenAI Journey?

Video walkthrough for using an existing LLM with the RAG method: Link
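For a rough feel of the RAG idea, here is a toy sketch in plain Python. The document snippets and the naive keyword-overlap retriever are made up for illustration; a real system would use vector embeddings, a vector store, and an actual LLM call in place of the final print.

```python
# Toy RAG sketch: retrieve the most relevant snippet from a small custom
# knowledge base and stuff it into the prompt sent to an LLM.
knowledge_base = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm, Monday through Friday.",
    "Premium customers get a dedicated account manager.",
]

def retrieve(query: str) -> str:
    """Pick the snippet sharing the most words with the query (naive scoring)."""
    query_words = set(query.lower().replace("?", "").split())
    return max(knowledge_base,
               key=lambda doc: len(query_words & set(doc.lower().split())))

question = "What is the refund policy?"
context = retrieve(question)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this prompt would then be sent to the LLM of your choice
```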

Python vs C# - .Net (Core)


Oftentimes we see newcomers struggling over whether to learn Python or .NET. The best answer is to learn both, but when making a choice, below are a few considerations.

.NET is very useful for enterprise software; it is compiled and statically typed, in contrast to Python, which is interpreted and dynamically typed.

The benefit with Python is that there is less to worry about regarding types: the same list can contain string items mixed with decimal types as well as integers (a minimal sketch follows below).
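A minimal sketch of that flexibility (the values are made up for illustration):

```python
# Python happily mixes types in one list; types are checked at runtime.
from decimal import Decimal

mixed = ["order-42", Decimal("19.99"), 3, "pending", 7.5]
for item in mixed:
    print(type(item).__name__, item)
# In C#, a List<T> forces you to pick one element type up front
# (or fall back to List<object> and cast manually).
```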

That is not the case with .NET, where you have to decide the types in advance. So with Python rapid development is possible, but for enterprise-grade speed you need .NET.

So you can quickly develop a prototype using Python, but when it comes to enterprise, use .NET.

Python's standard library (itertools) lets you compute permutations and combinations out of the box, which gives you an edge in software competitions; doing the same in .NET requires a NuGet package, something not allowed on most of the portals used in software competitions. A quick sketch follows below.
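A quick sketch of what the standard library gives you with no extra packages (the items are arbitrary):

```python
# itertools ships with Python; no extra package or NuGet equivalent needed.
from itertools import permutations, combinations

items = ["A", "B", "C"]
print(list(permutations(items, 2)))   # ordered pairs: ('A','B'), ('A','C'), ...
print(list(combinations(items, 2)))   # unordered pairs: ('A','B'), ('A','C'), ('B','C')
```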

Python is also your friend for quick and easy data analysis, letting you do some trending and machine-learning analysis in a few lines (see the sketch below).
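For example, a few lines of pandas are enough for a quick trend check (this assumes pandas is installed; the numbers are made up):

```python
# Quick-and-dirty trend analysis with pandas (pip install pandas).
import pandas as pd

sales = pd.Series([10, 12, 9, 15, 18, 17, 21],
                  index=pd.date_range("2023-01-01", periods=7))
print(sales.describe())          # summary statistics
print(sales.rolling(3).mean())   # 3-day moving average to eyeball the trend
```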

Hope this helps in your decision when planning a career learning path. Have a good day.

Workflows and when to use them

Firstly, what are workflows?

As Microsoft defines it:

Windows Workflow simplifies the authoring of long-running reactive programs by providing:

Activities that access external input.

The ability to create Bookmark objects that can be resumed by a host listener.

The ability to persist a workflow’s data and unload the workflow, and then reload and reactivate the workflow in response to the resumption of Bookmark objects in a particular workflow. Source

And when to use them:

It seems from the above that workflows give you the ability to pause and resume execution. But that alone won't justify them, because you could also pause background threads and resume them, so saying that workflows are doing some magic there is not true (a minimal sketch of pause/resume without a workflow engine follows below). The only reason you would want to use them is to give users the control to reorder execution or add more tasks just by dragging and dropping activities, so that non-technical end users can benefit from your complex blocks of code.
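To make that point concrete, here is a minimal sketch, not a workflow engine, showing that pause/resume alone does not require one; the step names are invented for illustration and a plain Python generator plays the role of a resumable "bookmark":

```python
# A resumable task without any workflow framework: the generator pauses at
# each yield ("bookmark") and the host resumes it whenever it likes.
def long_running_task():
    print("step 1: collect input")
    yield "waiting-for-approval"          # pause point
    print("step 2: process approved data")
    yield "waiting-for-external-system"   # another pause point
    print("step 3: finalize and report")

task = long_running_task()
print(next(task))   # runs until the first pause point
# ... hours later, the host "resumes the bookmark" ...
print(next(task))   # runs until the second pause point
```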

When not to use them

Most of the time, "workflow" is just a fancy word. It is often observed that workflows are implemented where the activities are not really resumable, shuffling them really breaks things, and the order of execution is not meant to be changed by end users but stays mostly the same, with changes made in code when needed. In that case, rest assured that workflows are just adding to your complexity.

It is often thought that SLA-based tracked execution, logging of what has been done, or the concept of transactions (ACID) is what you get when you use workflows. That is not true either; you can have all of them with the same amount of development effort without workflows.

Also see: When to Use workflow engines?

Ransomware - Why end up paying ransom when you can timely protect yourself

Rise Of Ransomware

The rise of Bitcoin and digital currencies has certainly opened the door to a new challenge: the bad guys now have a way to make you pay them. Your vulnerable systems and infrastructure can allow others to blackmail you. Even worse, once they are able to attack you, you can't simply get rid of them, and the vulnerabilities they exploit are not something you can fix in days, so most often you are totally helpless. This is one benefit of cloud infrastructure, that you don't have to worry about its security; but your laptops and desktops, and the data that resides on them, are still always your responsibility.

How To Protect

The best defense is only possible when you have some idea of how the attack happens. When you click on URL links, from the internet or in the emails you receive, you give hackers the chance to get into your world. Those links seem legit and might open up and look normal, but under the hood something malicious is planned. They make you download something which, when you execute it, puts your machine totally under someone else's control. So the laptops and desktops on which users check email and browse the web are the main sources, or origin points, from which this malicious activity starts.

Your first and foremost wish and priority, at that point, would be to stop this from spreading to all the rest of the machines in your network and, even worse, to the servers that contain all the important data of all the users. So how does the spread happen? Normally, to copy any file to a remote PC, in the case of Windows machines (which are mostly the target of these ransomwares), you need the SMB protocol that opens shared folders on other machines, you need a network path enabled between them, and you need credentials to access the machine. In 2017, the WannaCry ransomware exploited a vulnerability in SMB version 1 itself, which allowed it to open the shared folders of other machines even without passwords. This led to a lot of damage and was followed by a patch from Microsoft that can totally block SMB version 1 on the client machine as well as the server, essentially removing the ability of Windows machines to talk to each other over this vulnerable version of the protocol.

So your first defense, of course, is to have proper patching in place and ensure the old version of SMB, i.e. SMB 1.0, is totally blocked in your network. But is that enough to stop the spread? Definitely not. Remember, we discussed that all you need to reach another machine is the SMB protocol, network access, and credentials. Now suppose you are logged into a Windows machine with an administrator account that is commonly used to administer more than one machine; then obviously your currently logged-in credentials are sufficient to let the thing spread. So to protect yourself you need to follow the principle of least privilege, which says you should use the least privileges required to get the job done. If you need to administer just the file servers, then have an administrator account specific to file servers and don't use super accounts like domain admins. In fact, all those common accounts with access to almost all infrastructure should be kept super safe and used only when absolutely necessary. The same applies to common administrator account passwords. Suppose all your desktops share the same local administrator password and you are logged into one of them with the admin ID; that also makes the spread super simple. So you need to ensure there is no commonality across machines in terms of credentials. There are solutions for this, like Microsoft's free solution named LAPS, which rotates the local admin password on a schedule.

Your next important step needs to be segmentation at the network level. Remember, we discussed that the spread also requires a network path to be enabled. If a desktop is able to connect to every other desktop in the network, that's a problem. Normally there is no need for desktops to talk to each other; mostly desktops talk to servers, which are their endpoints. Good network administration means you segment your network. In particular, the ports used for SMB communication, which enable file transfer to other machines (normally port 445), need to be blocked on the network wherever they are not required. That is one smart thing you can do in advance. You can also extend the idea and implement an org-wide policy so that PCs cannot copy files from each other and can only use a file server for that purpose. The same idea can be extended to servers: not all servers need SMB communication to every other server. So a high level of segmentation is very important here. At the very least, one department's PCs should be totally blocked from reaching another department's PCs. For servers, also do extensive segmentation, especially for this type of communication.

I recall doing a security assessment for one organization that was using common credentials for all database server service accounts, which also had domain admin privileges. That is surely a call to disaster, because such accounts cannot even be changed easily afterwards: changing the password for an account like that is hard, since so many things are already tied to it. So that organization remained in a fix long after the disaster.

Implement a good backup strategy and ensure that backups are kept on a different type of technology. For example, if you have scheduled tasks that back up data onto file servers and the file server is compromised, what good is that backup to you? The backup files would just be more useless data. Dedicated storage attached to the backup solution/servers, not shared directly on the same network, is the type of thing you should look for. In particular, the communication between the backup solution and the rest of the infrastructure should be planned in a way that backup and restore can continue to function.

So, to put it together, use the points below to prevent an attack:

  • Make sure Windows and all software are up to date at all times. Regularly patch your machines (servers and desktops) and isolate obsolete and unpatched machines at the network level.
  • Plan your credentials well. Remove all common accounts from everyday use and segment the accounts. Consider implementing the Local Administrator Password Solution (LAPS) if your local administrator account has the same password on all client machines: https://www.microsoft.com/en-us/download/details.aspx?id=46899 
  • Ensure the weak SMB 1.0 protocol is properly blocked, and ensure the later SMB protocols (v2/v3) are also blocked on the network wherever file sharing between machines is not required. 
  • Consider implementing MFA for all admin accounts (MFA does not block the spread, but it is a good security measure to keep bad guys out).
  • Consider using Microsoft Advanced Threat Analytics: https://docs.microsoft.com/en-us/advanced-threat-analytics/what-is-ata 
  • Monitor your firewalls (at boundaries and on internal subnets) to identify machines generating suspicious traffic, like network scans, enumeration requests, or even exploit code usage (to be addressed by the relevant team). Consider enabling Windows firewalls.
  • Make sure relevant log retention is in place on your proxy/firewall.
  • Back up your security audit logs and increase their sizes so you can later do forensics. Consider a centralized security logging solution.
  • Train your users (spear-phishing, social engineering); for example, share this blog post with them.
  • Use an antivirus that provides the best ransomware protection. Some even offer free decryptors for common and new ransomware attacks.
  • Plan your backup strategy well in advance with such a situation in mind.

So the best defense is, of course, prevention. Doing all of the above is not possible at the last minute when you are already under attack.

There are certain things you should do once you are already affected:

  • Disconnect infected machines.
  • Reset compromised account passwords twice.
  • Reset local admin passwords twice, especially if the same passwords are used on several machines, to avoid lateral movement. 
  • Reset the passwords of privileged accounts in the domain (including service accounts) twice.
  • The krbtgt account password reset can be done for mitigation and monitoring purposes (do not forget to reset it twice), but it has to be planned, preferably with Microsoft experts/vendors like us.
  • Schedule a full AV scan on your machines.
  • Confirm that your local admin passwords are different and can't be used for lateral movement (LAPS usage). Local admin passwords need to be changed manually and on a case-by-case basis before bringing a server back onto the network. (Please consult with us if you need deployment of such a solution.)
  • Consider re-installing client machines used to administer any type of server. 
  • Conduct a complete investigation and incident response across the whole ecosystem.
  • Run vulnerability assessment scans and audits on all Internet-facing systems. 
  • Make sure all Windows security updates are installed (not only service packs).
  • Make sure all software is up to date. 
  • Check the Exchange rules configured on user accounts, especially for high-ranking executives (to block internal executable transfers via Exchange).
  • Monitor your firewalls (at boundaries and internal subnets) to identify machines generating suspicious traffic. 
  • On your Proxy/firewall make sure relevant log retention is in place. 
  • Increase event log sizes; see https://technet.microsoft.com/en-us/library/cc748849(v=ws.11).aspx.

I hope this helps others in time to prevent such incidents, and guides those already in such a situation. Definitely much less can be done once it has already happened, but a lot can be done beforehand to prevent the bad thing from happening in the first place. Feel free to connect with us in time so we can help fill any gaps.

 

Why Robotic Process Automation (RPA) is more than just scripting

RPA vs “traditional” technology solutions such as macros and scripts

When asked if Robotic Process Automation (RPA) can be likened to macros and scripts, replying "yes" is actually too simplistic an answer, and it understates the significant additional value and potential of RPA versus "traditional" technology solutions such as macros, scripts and similar technologies.

Before RPA, similar automations were implemented via scripts and macros. However, one was extremely limited in what was possible to achieve. These scripts worked relatively well when interacting with a single app (think Selenium for web automation), but when it comes to interacting with multiple applications, things tend to get a lot more complicated.

Another major difference is that RPA is independent of the application. Where you might need multiple scripting tools to create scripts for your various applications, RPA can interact with multiple applications at once at the object layer. It can be applied to virtually any application, and to multiple applications at a time. Indeed, RPA's root value driver is that the technology allows the user to interact with any type of application, whether it is web, Windows classic, WPF, Java, PDF, or Citrix. One could try to replicate this with scripts, and while that might work for one or two types, things get very complicated as you try to add more functionality to the mix. If you were to write an automation from scratch, you might be able to write a proof of concept, but scaling it to production code could take tens if not hundreds of times more work to handle all the exceptions and edge cases. Then, when you think you are done, an application update hits and you have to restart.

RPA Capabilities

Then there are the orchestration capabilities. UiPath has invested in this area for the last several years and reached the point where they now have a mature product in this space. Scheduling robots, making sure that they handle assets (like Windows credentials) securely, and having reliable queuing mechanisms are just a few of the capabilities offered.

The current and near-term future capabilities of RPA vastly exceed those of scripting, macros, etc. “Bots” can be created by trained SMEs in the business, using smart process recorders and drag and drop functionality. Most code is produced in the background, removing the need to involve IT in configuring most automations. Whereas IT needs to partner with the business when implementing RPA and be aware of steps to utilize RPA to automate business processes, RPA tools are designed to be built and maintained by the same operators that perform these processes today.

Macros and scripts are programming, with short sequences of code written to perform a single task, or a series of tasks. While a macro or script is linear and fixed, RPA robots are dynamic. They can “learn” and respond to stimuli, accumulating knowledge of procedures over time – thereby getting “smarter.”

Further, leading RPA tools include functionality that goes beyond the capability of scripts and macros, such as optical character recognition (OCR) and the ability to include artificial intelligence. These tools have built in access to functionality through Google Cloud such as Cloud Vision, Cloud Translation and Cloud Natural Language. We believe that RPA is a foundation with the capability for organizations to master and then build upon, adding cognitive and AI capabilities.  This is not possible with macros and scripts.

Finally, some RPA tools also come with a “Studio”, which is effectively a process development environment. When using scripts and macros you need developers to create and maintain these automations. This RPA ‘Studio’ takes this to a whole new level, where you can have business users (that have a good understanding of the processes they are automating and limited programming knowledge) now automate those processes.

Attended versus Unattended Automation

In addition to this, there is the potential for unattended automation. Having robots that can start jobs on computers where no one is logged in, within a secure environment, is not an easy job. And doing this reliably when you have thousands of jobs that need to start at precise times of the day makes things quite complicated.

Attended automation is useful when the entire end-to-end process is not capable of being automated. Attended bots can work alongside humans, triggered by system-level events that can give and take data to and from human workers. Attended robots optimize tasks by offloading portions of them, helping work get done faster. For example, a call center agent can get help from an attended robot in near real time during a live customer call. The attended robot can find customer data from one application and automatically type it into a second application. This way, the call center agent spends less time switching between applications and can focus on high-value tasks such as solving the customer’s problem. Attended robots tend to be dedicated to one individual or one machine, and typically “work” while the employee is working.

Unattended automation executes tasks and interacts with applications independent of human involvement. Unattended robots can be triggered by events, and they can be scheduled. They typically perform batch operations that do not require user intervention. For example, a batch of new client information is received in a spreadsheet and needs to be entered into multiple applications. Unattended robots can be shared across many employees and can "work" 24 x 7 x 365.

Being IT Architect

As per the TOGAF standard, it is the IT Architect role that connects IT with the business.

 

In order to perform this role, having technical knowledge of product "A", for example, is usually not sufficient for the role owner. One also needs to know the associated costs and the licensing model of the solution, plus at least some knowledge of the competing products. Without a grip on these, it is NOT possible to propose a viable solution in the context of costs and the overall market segment of the product.

The TOGAF process for developing an architecture is very straightforward,

Computing of 2020: Virtualization based on Docker Containers and Kubernetes


 

In the past, with the advent of Windows, it was possible for one process to write to a memory location that was in fact being used by the OS kernel or by another process, for example. A process could also halt the execution of the processor entirely. So testing was extremely difficult, because things were highly tied together, all the way from the host to all the applications running on top of that OS and their states. There actually was no isolation until Windows 95 came into the picture 25 years back: it came with virtual address spaces and a feature called pre-emptive multitasking (https://www.windowsdatarecoverysoftware.com/win95/). This gave the OS the ability to expose the address space as a virtual thing, with the actual addresses translated by the processor to somewhere else. It stopped any process with coding issues or bugs from bringing the whole operating system down, and the OS could bring down such a process by forcefully taking the CPU away from it. This was made possible with the help of a processor feature: the virtual 8086 mode introduced by the processor helped enable it (https://en.wikipedia.org/wiki/Virtual_8086_mode).

 

The virtual 8086 mode, after decades, eventually transformed into virtualization technology where multiple OSes could share the processor, run in total isolation from one another, and could not affect the whole server unconditionally. VMware and Hyper-V became major players, and this helped both production, in terms of consolidation of workloads onto the same physical host server, and dev environments, because rolling back and spinning up environments became feasible in terms of space and time.

 

Soon it was realized that repeating the OS kernel in each virtual machine running on the same host not only has its own disk space and CPU utilization cost, it is simply not very necessary. Docker came with the concept of sharing the kernel in a way that each process is totally isolated from other processes and sees the OS as entirely its own. Such a process is called a container. The cost of spinning up a new container is therefore much lower, with very little overhead on disk and CPU, so it became an easy success.

 

In order to support containers, support from the operating system was of course also essential, and the OS's overall size became an important consideration. Linux in its then-current state could not be containerized as-is either and needed some changes. Microsoft had to go through major changes, because bulky GUI-based server editions definitely could not be containerized, so they came up with stripped-down OS editions. The server editions went through the changes below.

 

Docker vs Virtual Machine

Docker is process virtualization, where a process sees the host as if it were isolated from all other processes on the same host. A virtual machine is operating-system virtualization, where multiple OSes run on the same hardware without knowing of each other's existence.

  

Disk Sizes

The Microsoft Windows full edition has roughly a 10 GB install size, while the Windows Server Core edition takes less space, around 5-6 GB. The newest addition to the family is Nano Server, the minimal server edition from Microsoft; its total install size is around 250 MB on disk, while the source Docker image is only about 100 MB.

  

Licensing is also much simpler,

source: docs.microsoft.com:

 

Memory Requirements,

Hyper-V Isolation vs Process Isolation


source: https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/hyperv-container

Shared Kernel Consideration

Base images are special OS images. You need to use the base image matching the host OS, and you also need to consider the patch level of the host OS; otherwise you need to use Hyper-V isolation (see the sketch below).
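As a rough sketch of how the isolation mode comes into play, here is an example using the Docker SDK for Python (pip install docker). This assumes a Windows container host; the nanoserver image tag is illustrative only and is not taken from the referenced repo.

```python
# Minimal sketch: running the same Windows container under the two isolation modes.
import docker

client = docker.from_env()

# Process isolation: the container shares the host kernel, so the base image
# build/patch level must match the host OS.
out = client.containers.run(
    "mcr.microsoft.com/windows/nanoserver:ltsc2019",  # illustrative tag
    command="cmd /c ver",
    isolation="process",
    remove=True,
)
print(out.decode())

# Hyper-V isolation: the container runs inside a lightweight VM, so a
# mismatched base image still works, at the cost of extra memory and startup time.
out = client.containers.run(
    "mcr.microsoft.com/windows/nanoserver:ltsc2019",
    command="cmd /c ver",
    isolation="hyperv",
    remove=True,
)
print(out.decode())
```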

Docker Engine and competing products

Key Takeaways:

> Docker is the de facto leading container standard as well as implementation as of now, with MS also integrating it with Windows to provide the necessary Docker infrastructure on Windows servers.

> A container is essentially process isolation. Be heedful when running on Windows Server about whether you are getting Hyper-V isolation instead, which should normally not be the case. 

> The benefit to DevOps is a much better CI/CD pipeline; the cost of creating new environments is much, much less, and testing is much more streamlined.

> The benefit to production is consolidation of workloads at much higher scale and rapid provisioning of services: no VM boot time, no disk overhead, no RAM/memory overhead.

> Portability of the solution increases many fold, similar to a virtual machine, but since Docker images are even smaller, portability is even better than with a VM.

> The layered approach of images through the Dockerfile focuses on your customizations derived from base images available on Docker Hub around the globe, so essentially you are carrying only the customization and your binaries or web publish folder. That really increases portability many fold: you are assured your code will run successfully without having to carry around big virtual machine hard disks, for example.

> Linux or Windows does not matter much, with .NET Core becoming cross-platform capable and MS putting all its efforts into .NET Core rather than .NET Framework, so .NET Core will be the main thing. This applies if your service runs on .NET Core.

> New Windows servers are no longer bloated with a GUI; Windows Nano Server has a roughly 100 MB image size and around 250 MB instance size when executed, so you get Linux-like footprint benefits along with the standardization of the MS world.

> You really don't have to be a Dockerfile expert. It is good to know the parameters for fine-tuning, but in most cases you can get the Dockerfile auto-created as part of Visual Studio, carry it along into production, and change it if needed.

 

 

Referenced Code: Shahid-Roofi-Khan/BlazorDockerSampleForWin2019 (github.com)

How to install Docker on Windows Server: https://docs.microsoft.com/en-us/virtualization/windowscontainers/quick-start/set-up-environment?tabs=Windows-Server

New cluster model in Windows Server 2019 allows a router's USB share to be used as a witness node

Cluster arbitration needs a third computer hosting a shared folder to act as a judge and facilitate automatic failover between the two computers hosting a highly available service.

However, it was costly to have one: it had to be joined to the domain, must not be a domain controller, must never be down, and could not sit in a DMZ, since domain authentication also had to succeed.

That is quite demanding in terms of hardware and software. The good news is that MS Windows Server 2019 removes ALL of these requirements.

This means NO Kerberos, NO domain controller, NO certificates, and NO Cluster Name Object needed. While we are at it, NO account is needed on the nodes. Oh my!!

You can even use your router's USB share to arbitrate your internal clusters!

It now also makes a lot of sense to use a cloud machine to arbitrate your internal on-premises cluster setups!

You just need your shares to support at least SMB 2.0 level file sharing.

#CloudWitness #Windows Server 2019 #NewFileShares #Security