
AGPM Production GPOs (under the hood)


Hello, Sean here. I’m a Directory Services engineer with Enterprise Platforms Support in Charlotte. Today, I’d like to talk about the inner workings of Advanced Group Policy Management (AGPM). Let’s begin by discovering what occurs behind the scenes when you take control of a Production GPO using AGPM.

The term “Production GPO” is used frequently in AGPM documentation to describe an existing GPO in Active Directory and differentiate between it and the copy that AGPM stores in the Archive to allow for “Offline Editing”.

For those new to AGPM, it provides many features to help you better manage Group Policy Objects in your environment. Role-based administration allows you to delegate certain actions to users, even those that may not be administrators. The four built-in roles are Reviewer, Editor, Approver and Administrator. Change-request approval helps to avoid unexpected and unapproved modifications to production GPOs. AGPM also provides the ability to edit GPOs offline, allowing for review and approval of the changes before committing them to production. Furthermore, version tracking of GPO changes, the ability to audit/compare versions and the rollback feature can help you recover from GPO changes that need to be revised. The Overview of Advanced Group Policy Management white paper (Link) has information about these features and more.

Environment Overview:

The environment has three computers: a domain controller, a member server, and a client.

  • CONDC1 : Windows Server 2008 R2 Domain Controller
  • CONAGPM : Windows Server 2008 R2 AGPM Server
  • CONW71 : Windows 7 AGPM Client

The AGPM server and client computers are members in the contoso.com domain. This scenario uses the 64-bit version of AGPM for server and client installations, but a 32-bit version is available as well. The AGPM server and client installs were done following the Step-by-Step Guide (Link). This document is also included on the MDOP disk (..\Documents\4.0\AGPM_40_Step-by-Step_Guide.pdf).

clip_image001

Tools Overview:

The following tools will be used to gather data during this exercise:

  • Microsoft Network Monitor (Link) will be used to capture the network traffic that is generated between each computer.
  • Process Monitor (Link) is a Windows Sysinternals utility that we will use to monitor the activity of individual processes running on each computer during the exercise.
  • Group Policy Management Console (GPMC) logging will be enabled (Link) to track the operations performed by this MMC snap-in on each computer. This will allow us to point out any differences in the snap-in’s behavior across the three computers. (A registry sketch for enabling this logging follows the list.)
  • Active Directory Object Auditing will be enabled (Link), notifying us of any changes to Active Directory Objects that we configure for auditing. This will generate events in the computer’s security event log.
  • Advanced Group Policy Management logging (Link) is configured via Group Policy. This will be enabled in order to see exactly what the AGPM components are doing on each computer.
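
GPMC tracing is turned on through a pair of registry values. Below is a minimal Python sketch of how you might set them; the key path and value names are assumptions based on common GPMC-logging guidance rather than anything shown in this post, so verify them against your own documentation before relying on it. Run it elevated on the machine whose GPMC activity you want to trace.

import winreg

# Assumed key path and value names for GPMC (gpmgmt.log) tracing; verify before use.
KEY_PATH = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Diagnostics"

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0, winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, "GPMgmtTraceLevel", 0, winreg.REG_DWORD, 2)   # assumed: 2 = verbose tracing
    winreg.SetValueEx(key, "GPMgmtLogFileOnly", 0, winreg.REG_DWORD, 1)  # assumed: write to the log file only

print("GPMC tracing values set; look for %TEMP%\\gpmgmt.log the next time GPMC.msc starts.")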

Prologue:

Before we begin, it’s important to understand how AGPM is able to delegate management of GPOs to non-Administrators. Delegation of the various AGPM roles is done within AGPM itself. All operations performed by AGPM in the domain are handled by the AGPM service account. During the AGPM server installation, you specify what account you wish to use as the AGPM service account. This single account is granted the permissions to create, delete and manage GPOs in the domain. When we start GPMC as a user who has delegated permissions within AGPM, even if the user account has no rights to manage GPOs by itself, AGPM instructs the service account to perform the actions on the user’s behalf.

When performing data collection on multiple systems like this, it’s important to understand how each component works, and under what security context it’s working. For this task, I’m logged into CONW71 with my AGPM Administrator account (agpmadmin). The changes I make through the AGPM console on CONW71 are commands sent through GPMC.msc as the user agpmadmin. Even though I request to change the status of a GPO that is located on a domain controller, the commands sent from CONW71 go to the AGPM service running on CONAGPM. On CONAGPM, the AGPM service receives those commands and evaluates what permissions the submitting user account has been granted.

Based on the role of the user submitting the commands to the AGPM service, the action will be allowed or disallowed. If the user has the appropriate permissions, the AGPM service builds the request to send to the domain controller and forwards it, not as the user who initiated the requests, but as the AGPM Service account. Since the AGPM service account is being used for the request sent to the domain controller, access is based on the permissions assigned to the AGPM service account.
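
To make that flow concrete, here is a tiny conceptual model of the delegation check written in Python. It is purely illustrative: the role names come from the list above, but the permission strings and the function are assumptions, not AGPM’s actual API.

# Conceptual model only -- not AGPM code. Roles are from the post; the
# permission names and function are illustrative assumptions.
ROLE_PERMISSIONS = {
    "Reviewer": {"review"},
    "Editor": {"review", "edit", "checkout", "checkin"},
    "Approver": {"review", "approve", "deploy", "control"},
    "Administrator": {"review", "edit", "checkout", "checkin",
                      "approve", "deploy", "control"},
}

def handle_request(user: str, role: str, action: str) -> str:
    # 1. The AGPM service evaluates the submitting user's delegated role.
    if action not in ROLE_PERMISSIONS.get(role, set()):
        return f"Denied: {user} ({role}) may not perform '{action}'."
    # 2. If allowed, the request sent to the domain controller is issued as
    #    the AGPM service account, not as the requesting user.
    return (f"Allowed: forwarding '{action}' to the domain controller as the "
            f"AGPM service account on behalf of {user}.")

print(handle_request("CONTOSO\\agpmadmin", "Administrator", "control"))
print(handle_request("CONTOSO\\jdoe", "Reviewer", "control"))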

Getting Started:

First, we’ll log into CONDC1 and create a few Organizational Units (OU) named “Development”, “HR” and “Sales”. By right-clicking on the OUs and selecting “Create a GPO in this domain, and Link it here”, we will create the new GPOs that will automatically be linked to their respective OUs. CONDC1 doesn’t have the AGPM server or client installed, so we will use the vanilla Group Policy Management Console (GPMC.msc). For the sake of today’s blog post, we’ll only be working with the “Dev Client Settings” GPO. Let’s add a few drive mapping GP Preference settings, just to make it seem a bit more authentic. Before we do anything further to the GPO, let’s make note of a few key details regarding the GPO.

  • The GPO GUID : {01D5025A-5867-4A52-8694-71EC3AC8A8D9}
  • The GPO Owner : Domain Admins (CONTOSO\Domain Admins)
  • The Delegation list : Authenticated Users, Domain Admins, Enterprise Admins, ENTERPRISE DOMAIN CONTROLLERS and SYSTEM

Second, we want to get each of our data collection tools ready to capture data. Logging options will be configured for GPMC and AGPM. Active Directory Object Auditing will be enabled, and our GPO will have auditing configured to report any attempted change, successful or not. Network Monitor and Process Monitor will be started and tracing on all three computers right before we take control of the production GPO.

Next, we’re ready to take control of the GPO using the AGPM client installed on CONW71. Computers that have the AGPM client installed have a new “Change Control” entry within GPMC. This is where we will perform most of the functions that brought us to install AGPM in the first place. On the “Uncontrolled” tab, we see a list of GPOs in the domain that are not currently controlled by AGPM. Let’s right-click on the “Dev Client Settings” GPO, and bring up a context menu where we select the “Control” option.

image

If we hold the delegated role of AGPM Admin or Approver, we’ll be prompted to add a comment for this operation. Without Admin or Approver, we’ll be asked to fill out a request form that will be emailed to the AGPM Approvers first. It’s always a good idea to comment with something meaningful, explaining why we’re taking ownership of this GPO. It’s not always obvious why changes were made to a GPO, and the comment is our chance to inform others of the reasons behind our action. If your organization has change control procedures, it would be an excellent place to link the action to the official change request identifier.

Assuming we have the permissions to take control of a production GPO, when we add our comment and click “Ok”, we will see a progress window appear. It will update itself with the progress it’s making on our request. It should report whether the operation was successful or not, and if not it should give us some additional information regarding the problem(s) it ran into.

Simple enough on the front end, but what exactly is taking place behind the scenes while we make those few clicks? Let’s take a look…

The AGPM Client

Network Monitor on the AGPM Client shows some TCP chatter back and forth between an ephemeral port on the AGPM client, and TCP Port 4600 on the AGPM server. TCP 4600 is the default port when installing the AGPM Server component, but you can change that during the install or after (Link) if you prefer. There is no communication between the AGPM client and the domain controller other than ARP traffic. The process making the calls to the AGPM server is MMC.exe.
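
If you ever need to confirm that an AGPM client can actually reach that listener, a quick TCP connection test is enough. Here is a minimal Python sketch; the server name and port are the defaults from this lab and are assumptions for your environment.

import socket

AGPM_SERVER = "CONAGPM.contoso.com"   # assumed lab server name
AGPM_PORT = 4600                      # default AGPM listener port

try:
    with socket.create_connection((AGPM_SERVER, AGPM_PORT), timeout=5):
        print(f"TCP {AGPM_PORT} on {AGPM_SERVER} is reachable.")
except OSError as err:
    print(f"Could not reach {AGPM_SERVER}:{AGPM_PORT} - {err}")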

image

Process Monitor on the AGPM Client is similarly sparse on information. MMC.exe accesses the registry and file system briefly as it builds the request to send to the AGPM server, and writes to the agpm.log file under the profile of the logged on user.

GPMC logging (gpmgmt.log) typically generates many entries, but none were recorded on the AGPM Client during this test.

AGPM logging on the client shows a number of actions being taken between the AGPM Client and AGPM Server. The control operation appears between two [Info] entries, and shows the various functions being called by the AGPM client to process and report the results from the operation to the user.

image

The AGPM Server

Moving to the AGPM Server, we can see a difference in behavior from nearly every data point.

The network capture from the AGPM Server shows the TCP communication back and forth with the AGPM Client followed by TCP and LDAP packets between the AGPM Server and the Domain Controller. Once the commands have been received from the AGPM Client, the AGPM Server initiates the requested actions with the Domain Controller. The request to change the GPC and its contents comes in the form of SMB SetInfo Requests.

image

If we drill down into the packet info, into the SetInfo Request… we’ll see the modified object:

image

And further down, the DACL changes:

image

The highlighted SID is for the AGPM Service account in our domain. We can confirm the AGPM service account’s SID by looking up the objectSID attribute of that account within ADSIEdit.msc. 0x001f01ff is the equivalent of Full Control. Notice that the owner is still set to S-1-5-32-544 (Built-In/Administrators). This is the case for every file and folder within the GPT except for the top-level folder named after the GPO’s GUID. Here we see the AGPM Service account’s SID again.
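
If you want to convince yourself that 0x001f01ff really is Full Control, the arithmetic is just the standard Win32 access-mask bits for a file object; a few lines of Python make it explicit.

# Standard Win32 access-mask components for a file object.
FILE_SPECIFIC_RIGHTS = 0x000001FF      # all nine FILE_* specific rights
STANDARD_RIGHTS_REQUIRED = 0x000F0000  # DELETE | READ_CONTROL | WRITE_DAC | WRITE_OWNER
SYNCHRONIZE = 0x00100000

FILE_ALL_ACCESS = STANDARD_RIGHTS_REQUIRED | SYNCHRONIZE | FILE_SPECIFIC_RIGHTS
print(hex(FILE_ALL_ACCESS))            # 0x1f01ff -- matches the DACL entry above
print(FILE_ALL_ACCESS == 0x001F01FF)   # True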

image

Once the AGPM Service account has been granted permissions, you can see it start to query the domain controller via LDAP and SMB2, copying the GPO over to the AGPM server. This is the AGPM server creating a copy of the GPO in the Archive you created during installation of the AGPM Server.

Process Monitor on the AGPM Server is very busy. First, the service checks for the Archive path, and reads through the gpostate.xml file, checking to see if it already knows about this GPO. The gpostate.xml file contains a historic view of GPOs known to AGPM. We see some LDAP communication between the AGPM server and the Domain Controller that corresponds to the AGPM server modifying permissions on the portion of the GPO that resides in Active Directory. This is followed by the AGPM service exploring the entire folder structure of the GPO’s SYSVOL component, modifying the DACL and Owner information to include the AGPM service account.
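
A quick way to verify the result of that DACL work is to look at the Group Policy Template folder with the built-in icacls tool. The sketch below shells out to icacls from Python; the SYSVOL path is built from this lab’s domain and the GPO GUID noted earlier, so adjust both for your environment.

import subprocess

# Assumed lab path: contoso.com domain plus the GPO GUID from this post.
GPT_PATH = (r"\\contoso.com\SYSVOL\contoso.com\Policies"
            r"\{01D5025A-5867-4A52-8694-71EC3AC8A8D9}")

result = subprocess.run(["icacls", GPT_PATH], capture_output=True, text=True, check=True)
print(result.stdout)   # the AGPM service account should appear with (F) -- Full Control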

In order to provide the ability to edit GPOs offline, AGPM makes use of the Archive to store a copy of each GPO it controls. The Process Monitor capture from the AGPM Server gives us a very good look at what’s going on between SYSVOL and the archive.

image

We see it start to dig into the Group Policy Template for the GPO we’re taking control of, reading the information from the folders and files beneath it. In the next image, we see the AGPM service query the registry for the location of the Archive.

image

We also see below that it reads from a Manifest.xml file. This is a hidden file that has some basic information about every GPO in the Archive: things like the GPO’s production GUID, the domain and domain GUID, as well as the AGPM-assigned GUID.

image

After this, the AGPM service starts to create a folder structure within the Archive for the GPO. What’s interesting here is that closer scrutiny reveals an uncanny resemblance to a standard GPO backup routine. If you’ve ever backed up a GPO using GPMC, you’ll recognize the files and folder structure created by AGPM when it adds a GPO to its archive.

image

Notice the GUID in the Archive path. AGPM creates its own unique identifier for the archived copy of the GPO. Process Monitor shows the AGPM service going back and forth, reading information from SYSVOL and writing it into the Archive. The AGPM service pulls the settings from the GPO and creates a gpreport.xml file with that information in it. GPReport.xml also has the following information within it:

  • GPO Name, Created Time, Modified Time and Read Time
  • Security Descriptor (Security principal SIDs with SDDL permissions)
  • Additional info regarding each Security Principal

Two other files in the archived GPO’s folder are Backup.xml and bkupInfo.xml (Hidden). Backup.xml contains the following information:

  • The list of Security Principals on the GPO, along with additional information about each
  • The actual settings from the GPO itself
    • Security Descriptor (in hex)
    • Options
    • UserVersionNumber
    • MachineVersionNumber
    • CSE GUIDs

BkupInfo.xml is essentially an excerpt directly from Manifest.xml of the info that pertains to this GPO.
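
Since these are all plain XML files, you can inventory the Archive with a few lines of Python. The sketch below walks each GUID-named folder under the Archive and dumps whatever bkupInfo.xml contains; the Archive location is an assumption for this lab, and no particular element names are assumed.

import os
import xml.etree.ElementTree as ET

ARCHIVE = r"C:\ProgramData\Microsoft\AGPM"   # assumed Archive location

for entry in os.listdir(ARCHIVE):
    bkup = os.path.join(ARCHIVE, entry, "bkupInfo.xml")
    if not (entry.startswith("{") and os.path.isfile(bkup)):
        continue                               # skip anything that isn't a GPO backup folder
    print(f"Archive folder: {entry}")
    for elem in ET.parse(bkup).getroot().iter():
        tag = elem.tag.split("}")[-1]          # strip any XML namespace
        if elem.text and elem.text.strip():
            print(f"  {tag}: {elem.text.strip()}")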

AGPM logging on the AGPM server doesn’t generate many entries during the control operation. It shows the incoming message, identifies the Client/Server SIDs (the SID of the user initiating the action on the AGPM Client, and that of the AGPM service account being used by the AGPM Server), and calls the appropriate functions. The control operation has the AGPM Server sending requests to check the GPO’s security (doGpoLevelAccessCheck()) and then take control of the GPO (ControlGPO()).

image

GPMC logging on the AGPM Server gives us a wealth of information. Without much delay, you see the GPMC log record an LDAP bind and permissions being modified on the GPO objects within Active Directory.

image

The next thing you’ll notice in the GPMC logging on the AGPM Server is a reference to backup-related functions being called. Remember seeing the AGPM server access the Group Policy Template and Container in the other data collections? Copying the GPO to the AGPM Archive is essentially a GPO backup, very much like the one you can perform in GPMC.msc. The remainder of the GPMC log is dedicated to the backup process.

image

The Domain Controller

This is the last stop in our data analysis. The network capture shows the traffic from the AGPM Server. Process Monitor, however, is a bit different. Where the AGPM Server had many entries specific to our operation to control the GPO, the activity in Process Monitor on the Domain Controller shows up only as reads and writes to the Active Directory database (NTDS.DIT). Process Monitor cannot show us what was read or written, so those entries tell us little about what is actually going on.

The Security log has generated many events, just in the short time it took to take control of this GPO. We can see the AGPM service account connect and read various attributes of the Group Policy Container from Active Directory. We’ll also see a single event for the actual modification of the Group Policy Container (GPC) replacing the current nTSecurityDescriptor information with one containing permissions for the AGPM Service Account.

image

The Object Name value in the event data corresponds to the objectGUID of the GPO’s container object within Active Directory.
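
You can cross-check that objectGUID yourself by querying the GPC directly. Here is a hedged sketch using the third-party ldap3 package; the server name, credentials, and GPO display name are assumptions for the contoso.com lab.

import getpass
from ldap3 import Server, Connection, NTLM, SUBTREE

server = Server("CONDC1.contoso.com")
conn = Connection(server, user="CONTOSO\\agpmadmin",
                  password=getpass.getpass("Password: "),
                  authentication=NTLM, auto_bind=True)

# Find the Group Policy Container by display name and print its objectGUID,
# which should match the Object Name in the 5136 audit event.
conn.search("CN=Policies,CN=System,DC=contoso,DC=com",
            "(&(objectClass=groupPolicyContainer)(displayName=Dev Client Settings))",
            search_scope=SUBTREE,
            attributes=["cn", "displayName", "objectGUID"])
for entry in conn.entries:
    print(entry.cn, entry.displayName, entry.objectGUID)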

Since neither AGPM nor GPMC was used on the Domain Controller, there are no corresponding logs from those tools to review.

In Closing

We’ve pulled the curtain away from the procedure of taking control of a production GPO, reviewed it from different perspectives using different tools, and found it’s a simple task broken up into a few common subtasks.

  • The AGPM service takes ownership of the GPO and adds itself to the DACL with Full Control, both on the Group Policy Container within Active Directory and the Group Policy Template in SYSVOL.
  • The AGPM service then performs a GPO backup to a specified location (the Archive).

Once the GPO is controlled by AGPM and backed up to the Archive, a number of other tasks can be performed on it, which we will cover in depth in future blog posts.

Complete series

http://blogs.technet.com/b/askds/archive/2011/01/31/agpm-production-gpos-under-the-hood.aspx
http://blogs.technet.com/b/askds/archive/2011/04/04/agpm-operations-under-the-hood-part-2-check-out.aspx
http://blogs.technet.com/b/askds/archive/2011/04/11/agpm-operations-under-the-hood-part-3-check-in.aspx
http://blogs.technet.com/b/askds/archive/2011/04/26/agpm-operations-under-the-hood-part-4-import-and-export.aspx

 

Sean “right angle noggin” Wright


AGPM Operations (under the hood part 2: check out)


Sean again, here for Part 2 of the Advanced Group Policy Management (AGPM) blog series, following the lifecycle of a Group Policy Object (GPO) as it transitions through various events. In this installment, we investigate what takes place when you check-out a controlled GPO.

Before editing an AGPM controlled GPO, it is checked out. There are several potential points of failure for the check-out procedure. Network communications during the backup can drop, leaving the Archive copy only partially created. Firewall rules can block network traffic, preventing the AGPM client from contacting the server. Disk corruption can cause the Archive copy of the GPO to fail to restore. We use the same tools to collect data for these blog posts and to troubleshoot most issues affecting AGPM operations.

In Part 1 of this series (Link) we introduced AGPM and followed an uncontrolled “Production” GPO through the process of taking control of it with the AGPM component of the Group Policy Management Console (GPMC). If you are unfamiliar with AGPM, I recommend you refer to the first installment of this series before continuing.

Environment Overview:

The environment has three computers: a domain controller, a member server, and a client.

  • CONDC1 : Windows Server 2008 R2 Domain Controller
  • CONAGPM : Windows Server 2008 R2 AGPM Server
  • CONW71 : Windows 7 AGPM Client

For additional information regarding the environment and tools used below, please refer to Part 1 of this series (Link).

Getting Started:

We start on our Windows 7 computer logged in as our AGPM Administrator account (AGPMAdmin). We need GPMC open, and viewing the Change Control section, which is the AGPM console. We are using the “Dev Client Settings” GPO from the previous blog post, so let’s review the GPO details.

  • The GPO GUID : {01D5025A-5867-4A52-8694-71EC3AC8A8D9}
  • The GPO Owner : Domain Admins (CONTOSO\Domain Admins)
  • The Delegation list : AGPM Svc, Authenticated Users, Domain Admins, Enterprise Admins, ENTERPRISE DOMAIN CONTROLLERS and SYSTEM

We also log into the AGPM Server and the Domain Controller and start the data capture from each of the tools mentioned previously.

As with most actions within AGPM, checking out a GPO is a simple right-click and select operation. Right click the “Dev Client Settings” GPO to bring up the context menu and select the “Check Out…” option.

image

Notice the grayed out “Edit” option for a checked in GPO. AGPM prompts for comments from logged on accounts with the AGPM Admin or Editor role delegated. Clicking “Ok” displays a progress window that updates us as the AGPM server request is processed. When it is complete, we return to the AGPM console and see the changed status.

image

Notice how the AGPM console differentiates between a "Checked out" GPO, and one that is "Checked in". The icon has a red outline, and the “State” column updates. The “Comment” column displays the comment entered during the most recent operation on the GPO; it is useful to add relevant information to the comment whenever possible.

Let’s look at the data we’ve collected for the Check-Out operation.

The AGPM Client

Network Monitor shows the AGPM Client and AGPM server communications. TCP port 4600 is the default for the AGPM server; this is configurable during the installation or afterwards (Link).

image

Process Monitor on the AGPM Client highlights the simple nature of the work done by the AGPM Client itself during the Check-Out procedure. The MMC process accesses gpmctabs.dll to generate the AGPM console, followed by access to agpm.log to write entries related to the communications between the AGPM Client and Server.

There were several entries in the GPMC log (gpmgmt.log) pertaining to the opening of the GPMC.msc snap-in and the lookup of each account defined on the Delegation tab for the GPOs. However, no entries were logged during the Check-Out operation itself.

AGPM Logging shows the exact same block of entries that we saw when taking control of a production GPO.

image

This log only shows entries related to the client establishing a connection with the AGPM server and sending it the instruction “ExecuteOperations()” and that the instruction has been completed.

The AGPM Server

Since we focused on the traffic between the AGPM Client and Server in the section above, we now examine the traffic between the AGPM Server and the Domain Controller. The first thing we notice is a lot of SMB traffic with the AGPM Server regarding a policy GUID that is different from that of the GPO we are checking out.

image

image

A search of the network trace for the “Dev Client Settings” GPO {01D5025A-5867-4A52-8694-71EC3AC8A8D9} turns up nothing. A quick refresh of GPMC.msc shows a brand new GPO in the list.

image

There are several important bits of information in the screenshot above. First, notice the name of the GPO: “[AGPM] Dev Client Settings”. Its GUID is the one we see in the network trace. Notice the “Created Date/Time”: it matches the time we checked out the “Dev Client Settings” GPO. The GPO is not linked anywhere, its history does not match that of the GPO it shares a name with, and the Delegation list shows full control granted to the account that checked it out. From here on, we refer to this GPO as the “Offline GPO”.

image

Within the network trace, we see the request to create the policy GUID folder in SYSVOL.

image

AGPM takes the same action to create the rest of the policy folder structure and contents. Security is set on these folders as well.

The AGPM Server log (agpmserv.log) shows the entries related to the process (“IAgpmServer.SendMessage()”) the server goes through to send the appropriate messages along to the Domain Controller to perform the actions we’ve requested via the AGPM console.

image

Process Monitor shows entries that confirm the service is writing to the Agpmserv.log file. It retrieves the registry path to the AGPM Archive and accesses gpostate.xml (located within the Archive). As mentioned in the first blog post, gpostate.xml contains a historic view of GPOs known to AGPM.

image

Process Monitor reports gpmgmt.log access as well. It’s important to note the user account in the path: the entries are logged under the security context of the account that is performing the actual GP management work.

image

The AGPM Server accesses the Archive path and copies the GPO folder and its contents to a path beneath the Archive’s Temp folder.

image

Next, we see the creation of the “Offline” GPO path in SYSVOL. GPMC builds out the new GPO based on the information copied from the AGPM Archive.

image

The gpmgmt.log created in the AGPM service account’s profile path shows the process taken to build the new GPO folder from the AGPM Archive copy. The log addresses each aspect of the GPO, from assigning security to configuring the GP settings. The process looks like a GPO Restore.

image

image

The Domain Controller

Looking at the network capture on the domain controller shows very little from the client (CONW71). SMB protocol negotiation, session setup and connection from the client to the DC’s IPC$ share are shown. We reviewed the network traffic from the AGPM server earlier in this post.

The DC security log shows several "Security-Auditing” 5136 events generated by the creation of the Offline GPO.

image

Editing the GPO

Now we’ve seen what takes place during a controlled GPO Check-Out. Let’s modify the GPO slightly, adding a setting or two. On our AGPM Client (CONW71), right-clicking on the checked-out GPO brings up the context menu, and the “Edit” entry is now clickable.

image

Notice the two new entries in the context menu, “Check In…” and “Undo Check Out…” We’ll come back to those in a bit. Only Editors and Administrators hold the ability to edit a GPO controlled by AGPM, so if we see the option still grayed out on a checked-out GPO, we need to make sure we have the appropriate permissions within AGPM. There is no prompt for a comment within AGPM when editing a GPO. Windows Server 2008 and later allow us to comment at the GPO level as well as at the setting level (within Administrative Templates), if we need to. With the Group Policy Editor started, we can use it to make changes to the checked-out GPO.

If we decide to check the GPO back in without saving any changes, we can select “Undo Check Out…”. This simply deletes the Offline GPO created during the Check-Out procedure, and removes the reference to it in gpostate.xml.

In Closing

In this second installment, we covered a procedure that is repeated every time there’s a need to modify a GPO within AGPM. During the Check-Out of a GPO, the following steps are performed:

  • The Archive copy of the GPO is copied to a temp folder.
  • From the duplicated Archive data, a new “Offline” GPO is created with the [AGPM] prefix by performing a GPO Restore.
  • The GPOs entry within the Archive’s gpostate.xml file is updated to reflect its checked-out state, and references the newly created “Offline” GPO.
  • Once the Check-Out procedure is complete, the temp copy of the Archive data is deleted.
  • The “Offline” GPO is not linked anywhere, and edits to it are made in real-time.

From this information, we can make an important observation: any changes made to an AGPM-controlled GPO outside of the AGPM console (i.e. the rogue Domain Admin that doesn’t bother with the AGPM console and edits the GPO directly through GPMC.msc) are overwritten the next time the GPO is deployed from the AGPM console. Since the Check-Out procedure builds the editable “Offline” GPO from the AGPM Archive data, the Admin’s changes are not included automatically. We do have the option of using the “Import from…” feature to pull the settings from the production GPO again prior to the Check-Out, which updates the Archive data with any changes made outside of AGPM.

Come back for Part 3 of this series, where we will check our GPO back in.

Complete series

http://blogs.technet.com/b/askds/archive/2011/01/31/agpm-production-gpos-under-the-hood.aspx
http://blogs.technet.com/b/askds/archive/2011/04/04/agpm-operations-under-the-hood-part-2-check-out.aspx
http://blogs.technet.com/b/askds/archive/2011/04/11/agpm-operations-under-the-hood-part-3-check-in.aspx
http://blogs.technet.com/b/askds/archive/2011/04/26/agpm-operations-under-the-hood-part-4-import-and-export.aspx

Sean "To the 5 Boroughs " Wright

AGPM Operations (under the hood part 3: check in)


Sean again, here for Part 3 of the Advanced Group Policy Management (AGPM) blog series, following the lifecycle of a Group Policy Object (GPO) as it transitions through various AGPM-related events. In this installment, we investigate what takes place when you check-in a controlled GPO.

Before editing an AGPM controlled GPO, it is checked-out. Similarly, after editing the GPO, it is checked in before the changes are deployed to production. Many of the same failure points exist for both the check-out and check-in processes. Network communications during the restore can drop, leaving the production GPO only partially updated. Disk corruption can cause the Archive copy of the GPO to fail to restore correctly. The AGPM service account could fail to authenticate when attempting to perform the requested operation. We use the same tools to collect data for these blog posts and to troubleshoot most issues affecting AGPM operations.

In Part 1 of this series (Link), we introduced AGPM and followed an uncontrolled “Production” GPO through the process of taking control of it with the AGPM component of the Group Policy Management Console (GPMC). If unfamiliar with AGPM, I would recommend you refer to the first installment of this series before continuing.

Part 2 of the series (Link) continued the analysis of this GPO as it was Checked-Out using AGPM. We revealed the link between AGPM controlled GPOs and the AGPM Archive as well as how AGPM provides for offline editing of GPOs. If you haven’t read Part 2, I recommend doing that now.

Environment Overview:

The environment has three computers: a domain controller, a member server, and a client.

  • CONDC1 : Windows Server 2008 R2 Domain Controller
  • CONAGPM : Windows Server 2008 R2 AGPM Server
  • CONW71 : Windows 7 AGPM Client

For additional information regarding the environment and tools mentioned below, please refer to Part 1 of this series (Link).

Getting Started:

We start out on our Windows 7 computer, logged in as our AGPM Administrator account (AGPMAdmin). We need GPMC open, and viewing the Change Control section, which is the AGPM console. We are using the “Dev Client Settings” GPO from the previous blog post so let’s review the GPO details:

  • The GPO GUID : {01D5025A-5867-4A52-8694-71EC3AC8A8D9}
  • The GPO Owner : Domain Admins (CONTOSO\Domain Admins)
  • The Delegation list : AGPM Svc, Authenticated Users, Domain Admins, Enterprise Admins, ENTERPRISE DOMAIN CONTROLLERS and SYSTEM

We also want to log into the AGPM Server and the Domain Controller and start the data capture from each of the tools mentioned in the previous section.

Picking up where we left off from the previous blog post, we now have our GPO checked out and modified with some new settings. When we’ve made the desired changes to the Group Policy Object, we close the Editor and return to the AGPM Console. In order to check it back in, we right-click the GPO in the AGPM console and select the “Check In…” option. We have the option to enter a comment for the check-in operation. The red-outlined GPO icon returns to normal once checked back in.

The AGPM Client

As we might expect, Network Monitor shows that the traffic is mainly TCP between the AGPM Client and port 4600 on the AGPM Server.

image

Process Monitor shows MMC writing to the AGPM.log file, but otherwise has few entries that relate to the Check-In process. As before, this shows the AGPM client does not perform any of the operations on the GPO itself. It simply relays the instructions to the AGPM Server.

There were no entries generated in the GPMC log during the Check-In operation. Considering the only entries in the log pertained to the startup of GPMC, these actions within the AGPM console obviously do not flag any GPMC logging events.

The AGPM.log shows nearly identical information in the Check-In operation as it did in the Check-Out. The AGPM Client contacts the AGPM Server and notifies it of incoming instructions. When the AGPM Server is ready, the AGPM Client sends the instructions and awaits return information. Once the AGPM Server returns the resulting data the function exits successfully.

image

AGPM Server

We covered the AGPM client network traffic in the previous section. Once the AGPM client gives instructions to the AGPM server, that server opens an LDAP connection to the Domain Controller. The AGPM server accesses the checked out GPO information within Active Directory and SYSVOL. While we can’t see exactly what’s being read from the directory, we do see the SMB traffic as the AGPM server reads the information from SYSVOL.

image

Process Monitor shows quite a lot of activity from the Agpm.exe process. It starts out by looking up the AGPM Archive path from the registry, and accessing gpostate.xml to determine the status of the GPO.

image

Within the gpostate.xml, each GPO has its status and check-in history listed.

image

The "agpm:type" entry indicates the “CHECKED_OUT” status, the time of the operation, the comment entered during the check-out operation and the SID of the user performing the operation. This is also where the reference to the "agpm:offlineId" is found, which is the Offline GPO's GUID created during the Check-Out process.

The AGPM process then looks to the manifest.xml file, which contains entries for every time a GPO was backed up to the AGPM Archive. From Part 1 of this blog series, we learned that taking control of a production GPO initiates a backup of that production GPO into the AGPM Archive. At this point, AGPM.exe uses the manifest.xml to check the current backup status.

image

Next, we see the AGPM server read the SYSVOL folder for the Offline GPO, and start verifying that the folder structure within the AGPM Archive matches.

image

image

AGPM then copies files from the GPO’s SYSVOL folders to their corresponding location in the AGPM Archive path. Here we see the copy of the Computer Configuration registry settings file.

image

Once copied, AGPM updates the manifest.xml and bkupInfo.xml files within the GPO’s Archive folder.

image

Whereas the bkupInfo.xml file contains this information only for the GPO just backed up, manifest.xml holds a copy of that same information for every GPO in the Archive. The following is the bkupInfo.xml for the GPO check-in.

image

AGPM updates Backup.xml with the modified GPO’s security settings, as well as any new GP Extensions required. GPreport.xml contains all of the settings within the checked out GPO.

Now that the checked out and modified GPO is backed up to the Archive, the gpostate.xml file is updated to reflect the new “CHECKED_IN” status of the GPO. Notice the AGPM Archive path has changed from {85B77C99-1C4B-473C-A4E5-0AF10DD552F9} to {CD595C25-5EC6-4653-8E24-0E640588C654}.

image

It’s important to note what we do not see here: AGPM does not write the modified GPO to SYSVOL under the production GPO’s GUID {01D5025A-5867-4A52-8694-71EC3AC8A8D9}. This is evidence that checking in a GPO we modified in AGPM does not commit the changes to production. In order to do that, we must ‘Deploy’ the GPO within AGPM.

The gpmgmt.log entries from the Check-In operation mirror much of what we saw in Process Monitor. AGPM backs up the Offline GPO to a newly created Archive path, and then updates gpostate.xml, bkupInfo.xml and Manifest.xml to associate the production GPO with the new path.

image

The AGPMserv.log has a very limited view of the process, simply recording that the GPO Check-In function “CheckInGPO()” was called.

image

The Domain Controller

We’ve already covered the network traffic between the AGPM Client and Server and the Domain controller, so let’s move on to the Process Monitor output. Similar to the activity during the Check-Out operation, lsass.exe is accessing the Active Directory database, pulling the GPO information from the corresponding GP Container.

The security event log should have events correlating to the removal of the Offline GPO. Look for Event ID: 5136.
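
If you want to pull those events without opening Event Viewer, the built-in wevtutil tool can filter the Security log by event ID. Here is a small Python wrapper around it; run it elevated on the domain controller.

import subprocess

QUERY = "*[System[(EventID=5136)]]"   # Directory Service Changes events
result = subprocess.run(
    ["wevtutil", "qe", "Security", f"/q:{QUERY}", "/f:text", "/c:10", "/rd:true"],
    capture_output=True, text=True, check=True)   # /rd:true = newest events first
print(result.stdout)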

In Closing

In this third installment, I covered part of a procedure that is repeated every time there’s a need to modify a GPO within AGPM. To rehash from Part 2 of this blog series, during the Check-Out of a GPO, the following steps are performed:

  • The Archive copy of the GPO is copied to a temp folder.

  • From the duplicated Archive data, a new “Offline” GPO is created with the [AGPM] prefix by performing a GPO Restore.
  • The GPO’s entry within the Archive’s gpostate.xml file is updated to reflect its checked-out state, and references the newly created “Offline” GPO.
  • Once the Check-Out procedure is complete, the temp copy of the Archive data is deleted.
  • The “Offline” GPO is not linked anywhere, and edits to it are made in real-time.

During the Check-In process, we have observed the following:

  • A new Archive path is created with a new GUID

  • A GPO Backup is performed of the “Offline” GPO to the newly created Archive path
  • The “Offline” GPO is deleted
  • Gpostate.xml, bkupInfo.xml and Manifest.xml are updated to reflect the new association between the originally Checked-Out GPO and the new Archive path

From this information, we can make a few important connections: any changes made to an AGPM-controlled GPO outside of the AGPM console (i.e. the rogue Domain Admin that doesn’t bother with the AGPM console, and edits the GPO directly through GPMC.msc) are overwritten the next time the GPO is deployed from the AGPM console. Since the Check-Out procedure builds the editable “Offline” GPO from the AGPM Archive data, the Admin’s changes are not included automatically. We do have the option of using the “Import from…” feature to pull the settings from the production GPO again prior to the Check-Out, which updates the Archive data with any changes made outside of AGPM. As mentioned earlier, the Check-In operation does NOT commit the changes to the production GPO. We must follow the Check-In operation with a “Deploy” in order to have our changes released to production.

Complete series

http://blogs.technet.com/b/askds/archive/2011/01/31/agpm-production-gpos-under-the-hood.aspx
http://blogs.technet.com/b/askds/archive/2011/04/04/agpm-operations-under-the-hood-part-2-check-out.aspx
http://blogs.technet.com/b/askds/archive/2011/04/11/agpm-operations-under-the-hood-part-3-check-in.aspx
http://blogs.technet.com/b/askds/archive/2011/04/26/agpm-operations-under-the-hood-part-4-import-and-export.aspx

Sean "my head will not shift when stored in the overhead compartment" Wright

AGPM Operations (under the hood part 4: import and export)


Sean again, here for Part 4 of the Advanced Group Policy Management (AGPM) blog series, following the lifecycle of a Group Policy Object (GPO) as it transitions through various events. In this installment, we investigate what takes place when you use the Import and Export features within AGPM.

With the use of Group Policy so common in today’s Active Directory environments, there may be a need to create new GPOs with a baseline of common settings already in place. Taking GPOs from one domain and creating an identical GPO in another domain or forest may be required. Having a backup copy of a GPO to keep elsewhere for disaster recovery is always handy. Using the Import and Export features of AGPM, an admin can accomplish all of these.

In Part 1 of this series (Link), we introduced AGPM, and followed an uncontrolled, or “Production” GPO through the process of taking control of it with the AGPM component of the Group Policy Management Console (GPMC). If you are unfamiliar with AGPM, I would recommend you refer to the first installment of this series before continuing on.

Part 2 of the series (Link) continued the analysis of this GPO as it was Checked-Out using AGPM. We revealed the link between AGPM controlled GPOs and the AGPM Archive as well as how AGPM provides for offline editing of GPOs.

With Part 3 of the series (Link), we picked things back up with our checked out GPO and checked it back in. Our analysis of the process pointed out how AGPM keeps previous Archive folders, and how it maintains the historic link between the managed GPO and each of its previous iterations.

Environment Overview:

The environment has three computers: a domain controller, a member server, and a client.

  • CONDC1 : Windows Server 2008 R2 Domain Controller
  • CONAGPM : Windows Server 2008 R2 AGPM Server
  • CONW71 : Windows 7 AGPM Client

For additional information regarding the environment and tools used below, please refer to Part 1 of this series (Link).

Before We Begin:

Since the Export function is very straightforward, it doesn’t warrant an entire blog post.  As such, let’s go over it quickly here, to summarize what takes place during an Export before we move on to looking at the Import function.

The AGPM Client and Server are the only two involved in the Export operation.  The client sends the instructions to the AGPM Server, which calls the “ExportGpoToFile()” function as shown below.

image

image

The information from the Archive folder is copied into temp folders within the AGPM Archive before being written into the .cab file.  The contents of the .cab file depend on the settings within the GPO.  For example, if the GPO has any scripts configured, the script file itself will be included along with a scripts.ini file containing options for the script execution.  Registry settings will be included in a registry.pol file.  Drive mapping preference settings will cause a drives.xml file to be included, and so on.

Once the .cab file is created within the AGPM Archive temp folder, it is copied over to the desired destination folder on the AGPM Client.
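
If you are curious what ended up inside a particular export, the built-in expand.exe can list a cabinet’s contents without extracting it. Below is a small Python sketch; the .cab path is an assumption for this lab.

import subprocess

CAB_PATH = r"C:\Exports\Dev Client Settings.cab"   # assumed export location

# expand -D lists the files in the cabinet without extracting them.
result = subprocess.run(["expand", "-D", CAB_PATH], capture_output=True, text=True, check=True)
print(result.stdout)   # expect Backup.xml, gpreport.xml, registry.pol, drives.xml, etc.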

image

Now that we have that out of the way, let’s move on to the focus of this blog post. The Import!

Getting Started:

We start on our Windows 7 computer logged in as our AGPM Administrator account (AGPMAdmin). We will need GPMC open, and viewing the Change Control section, which is the AGPM console. We’ll be using the “Dev Client Settings” GPO from the previous blog post, so let’s review the GPO details.

  • The GPO GUID : {01D5025A-5867-4A52-8694-71EC3AC8A8D9}
  • The GPO Owner : Domain Admins (CONTOSO\Domain Admins)
  • The Delegation list : AGPM Svc, Authenticated Users, Domain Admins, Enterprise Admins, ENTERPRISE DOMAIN CONTROLLERS and SYSTEM
  • Current ArchiveID : {1946BF4D-6AA9-47C7-9D09-C8788F140F7E}

If you’re familiar with the previous entries in this blog series, you may notice a new entry above. The ArchiveID value is simply the current GUID assigned to the backup of this GPO in the AGPM Archive. It’s included here because we will observe the activity within the AGPM archive caused by the Import and Export functions.

Before we begin, we log into the AGPM Server and the Domain Controller and start the usual data capture tools discussed previously. Right-clicking the Checked-In GPO displays the context-sensitive menu, and we see both the “Import from…” and “Export to…” items on the list. Mousing over the “Import from…” selection, we get a slide-out menu that has “Production” and “File”. Notice the grayed-out “File” option below; you cannot import settings from a file into a checked-in GPO.

image

For our first test, we select the option to import from production. We are prompted to enter a comment when logged in as an AGPM Administrator or AGPM Editor. It’s always a good idea to provide some context to the action. Since AGPM keeps a history of GPOs it manages, use the comments to keep track of ‘why’ you performed certain actions.

The GPO Import progress dialog tells us when the operation is complete. Clicking the “Close” button brings us back to the AGPM Console. Let’s look at the data we’ve captured to see what really happened.

The AGPM Client

Similar to the Network Monitor analysis of our previous entries in this blog series, we see a small amount of traffic to TCP port 4600 on the AGPM Server.

image

The AGPM log shows the same block of information we’ve seen in every other data capture in this blog series. The AGPM client begins the AgpmClient.ProcessMessages() function, connects to and notifies the server of incoming operation requests, sends the commands over and receives the server response.

image

The AGPM Server

Network traffic from the AGPM Client was covered above, so we’ll focus on what’s going on between the AGPM Server and the Domain Controller. SMB2 traffic shows the AGPM Server reading the GPO information from SYSVOL.

image

image

There is a significant amount of traffic between the AGPM Server and the Domain Controller on TCP port 389 (LDAP), which would be the AGPM Server reading the GPO information from Active Directory.

We retrieve the AGPM Archive registry path and access gpostate.xml for the GPO’s information.

image

I mentioned the ArchiveID value for this GPO earlier. The following screenshot is from gpostate.xml BEFORE the Import.

image

Next, we read the manifest.xml file. The following screenshot is from BEFORE the Import.

image

Once AGPM has verified the current information on the GPO, it reads the GPO information from the Domain Controller and writes it into the AGPM Archive.

image

image

Notice how the GUID in the Archive path is different? AGPM creates a new ArchiveID/GUID to store the GPO data. The Backup.xml, bkupInfo.xml and overall Manifest.xml files are updated with the new Archive ID information.

Finally, we update the gpostate.xml with the new information, as shown here. Notice the original Archive path GUID moves to the second <History> entry now.

image

image

The GPMC log shows some elements familiar to those of you who have read the previous entries in this blog series. GPMC performs a GPO Backup routine, pulling data from the production GPO and storing it in the newly created AGPM Archive path.

image

The AGPMserv.log shows the typical block of messages related to receiving, processing and responding to the AGPM Client.

image

The Domain Controller

We’ve already covered network traffic between the three systems, and Process Monitor shows events we would expect on any Domain Controller.

The security event log shows a number of Object Access entries, where the AGPM service account is used to read properties from AD objects. This is AGPM reading the GPO information out of Active Directory.

image

In Closing

This fourth entry in the AGPM Operations series covers the import of group policy settings from a Production GPO. Specifically, we covered importing the production GPO settings into an existing, AGPM controlled GPO.

  • The AGPM Archive folder for a controlled GPO is linked to its Production GPO in the gpostate.xml file.
  • The Import from Production process utilizes a GPO Backup, storing the settings in a newly created Archive folder.
  • The previous Archive folder is maintained for rollback/historic purposes
  • The gpostate.xml file references both the current Archive folder GUID as well as the previous versions’.

Another method exists for importing settings into AGPM Controlled GPOs. The Export of a GPO within the AGPM console creates a .cab file containing all of the files and settings associated with that GPO. The Import from File feature uses these .cab files to import settings into new or existing GPOs within AGPM, in the same domain or in foreign domains as well. Whereas the Import from Production feature only works with existing AGPM Controlled GPOs, when creating a new GPO within the AGPM console you can opt to import the settings directly from an exported GPO’s .cab file. From our observations here, we can deduce that the newly created GPO gets a new AGPM Archive folder and an entirely new entry in gpostate.xml. Unlike the Import from Production we investigated above, the information used to create the new GPO is sourced directly from the .cab file instead of querying the Domain Controller.

Complete series

http://blogs.technet.com/b/askds/archive/2011/01/31/agpm-production-gpos-under-the-hood.aspx
http://blogs.technet.com/b/askds/archive/2011/04/04/agpm-operations-under-the-hood-part-2-check-out.aspx
http://blogs.technet.com/b/askds/archive/2011/04/11/agpm-operations-under-the-hood-part-3-check-in.aspx
http://blogs.technet.com/b/askds/archive/2011/04/26/agpm-operations-under-the-hood-part-4-import-and-export.aspx

Sean "two wrongs don't make a" Wright

Forcing Domain Admins to use AGPM (but not really)


Hi folks, Sean Wright here for my final post. So, you have AGPM installed, but your Domain Admins continue using GPMC to create, delete, and modify Group Policy. You’ve asked nicely, but that hasn’t had much effect. Now you want to make your point, and prevent your Domain Admins from managing Group Policy the wrong way. You decide to deny Domain Administrators the rights to modify Group Policy Objects (GPOs) through any means save the AGPM console. It may seem like a good idea, but let me explain how your time is better spent elsewhere.

First, let’s cover the concept of a domain administrator. The domain admin is the most trusted and unrestricted user account in the domain. The domain admin can do anything in the domain and can give themselves permissions that make anything possible. The domain admin is the "Domain Overlord" if you will. Go ahead, laugh maniacally now, I’ll wait.

The very notion that you want to deny something to a Domain Admin is a foreign concept. You don’t deny them anything. They deny rights to others. Windows and Active Directory are built upon this fundamental concept, which brings us to our next section.

Why you’re wasting your time:

Active Directory is tailored to Domain Admins being all-powerful. No matter what you do to restrict their rights, they can simply change it back at will. You can make it difficult, which might discourage them… but a determined admin can undo anything you change.

You now have a new admin on the team, and while troubleshooting “Random Group Policy Problem #5”, they receive an access denied error when managing policy through GPMC. They should be using AGPM, and the fact that they are unaware of this is a whole other issue. Most admins take an access denied error as a bad thing; after all, they are an admin. So they may start “fixing” the environment by changing permissions.

If you contact Microsoft Support for a Group Policy related issue, we will likely return the permissions to defaults before proceeding with troubleshooting. We do not recommend this scenario, because you can't prevent a domain administrator from being a domain administrator, and your efforts can be so easily undone.

If you modify permissions on policy folders within SYSVOL, you’re going to trigger replication for every file and folder that is changed. In large environments with many policies, that can be a significant network traffic surge.

Most importantly, Microsoft has not tested this scenario, so you may introduce unforeseen problems to your environment by attempting it.

What you should do instead:

The advice I give to every customer who wants to force domain admins into AGPM is education. You can’t prevent a domain admin from doing something if they are determined. If you can’t trust your domain admins to do the right thing, and do it the right way, then they should not be domain admins. That said, I suggest educating administrators by teaching them about AGPM and its benefits. Explain why they should use only AGPM to manage policy, and you will likely see them consciously decide to go the extra mile to do things the correct way.

Recently, I had a customer insist AGPM was incomplete, because it did not have this restrictive functionality built-in. The developers did not intend for AGPM to restrict admins. It was designed to provide benefits that make troubleshooting and administration of policy more manageable.

If you’re still reading, and are determined to try this in spite of my recommendations against it:

Editing existing Group Policy objects

During installation, in an effort to make things easier, some customers simply add the AGPM service account to the Domain Administrators group. Since we’re about to prevent domain admins from accessing production GPOs, you’ll want to read over the AGPM Least Privilege scenario and make sure you have successfully implemented this before you proceed.

1. We’ll need to remove any Administrative users or groups from the “Group Policy Creator Owners” group. You can do this through Active Directory Users and Computers.

2. If it’s not already there, make sure you add the AGPM service account to “Group Policy Creator Owners”.

3. Open the Group Policy Management Console (GPMC.msc) and find the Group Policy Objects container. The Delegation tab shows a list of users/groups that have the ability to create new GPOs in the domain. You can try to remove Domain Admins from this location, but alas, it won’t let you.

image

Note: This is a safety feature, designed to prevent you from accidentally removing all rights to create GPOs.

What you can do is prevent your domain admins from editing the existing GPOs.

4. Within GPMC, expand the Group Policy Objects container and find the Default Domain Controllers Policy.

5. Select the Default Domain Controllers GPO and go to the Delegation tab.

6. Remove the Domain Administrators and Enterprise Administrators groups from the delegation list.

7. Make sure the list contains SYSTEM with full control, and ENTERPRISE DOMAIN CONTROLLERS and the Authenticated Users entries with Read permissions (at least).

8. Repeat steps 5 through 7 for every GPO currently in your environment.

image

This makes your existing GPOs resistant (but not immune) to your administrator’s editorial charms.

9. Next, open GPMC with your AGPM Administrator account and go to the AGPM console.

10. Click on the Production Delegation tab and remove Domain Administrators and Enterprise Administrators from this location. This tab within the AGPM console determines the permissions AGPM assigns to controlled GPOs when they are deployed to production using AGPM. Making this change prevents all of the hard work you just did in the section above from going to waste.

image

Don’t worry that the list looks incomplete; Authenticated Users and the AGPM service account are all that need to be added to production GPOs.

Control Group Policy object links

So far, we’ve removed the domain admin’s ability to edit existing GPOs, but they can still create new GPOs and link new and existing GPOs to OUs. In order to prevent these actions, we need to explicitly deny specific rights related to Group Policy.

1. Open GPMC and click on the domain node that contains the name of your domain.

2. Click Delegation and click Advanced.

3. On the domain’s security dialog box, click Advanced to open the Advanced Security Settings dialog.

4. Click the Add button to add a new entry.

5. Type Domain Admins and then click Check Names. Click OK to show the Permission Entry dialog.

6. Click Properties.

7. Select the Deny check box next to the permissions Write gPLink and Write gPOptions.

8. Click OK on all dialogs until you return to GPMC.

image

9. Check the permissions by right-clicking the node with the name of the domain. Notice the menu items Create a GPO in this domain, and Link it here… ; Link an Existing GPO… ; and Block Inheritance are unavailable. Additionally, the menu items Enforced and Link Enabled are unavailable on existing GPO links.

image

10. You will need to repeat steps 1-9 for every OU in your domain. This change is also needed for any newly created OUs. It might seem easier to set these deny permissions at the domain level and let inheritance propagate the settings down to existing and new OUs, but it doesn’t work. When an OU is created in Active Directory, permissions are explicitly defined at the OU level. When you set an explicit deny permission at the domain level, inheritance applies it as an implicit deny at the OU level. An explicit deny wins over an explicit allow; however, an explicit allow wins over an implicit deny.

Note: There is also an option to change the default permissions applied to new OUs as they are created. This option modifies the schema, so use caution when modifying any value in the schema. The defaultSecurityDescriptor attribute is in SDDL format, so I recommend you configure one OU with the correct security settings and copy the value. This prevents having to manually set the permissions as new OUs are created in the future.
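
Before changing anything, it is worth capturing the current value. The hedged sketch below uses the third-party ldap3 package to read defaultSecurityDescriptor from the Organizational-Unit class in the schema; the server name, credentials, and schema DN suffix are assumptions for the contoso.com lab, and the script is read-only.

import getpass
from ldap3 import Server, Connection, NTLM, BASE

server = Server("CONDC1.contoso.com")
conn = Connection(server, user="CONTOSO\\Administrator",
                  password=getpass.getpass("Password: "),
                  authentication=NTLM, auto_bind=True)

# Read-only: print the SDDL string applied to newly created OUs.
schema_dn = "CN=Organizational-Unit,CN=Schema,CN=Configuration,DC=contoso,DC=com"
conn.search(schema_dn, "(objectClass=classSchema)", search_scope=BASE,
            attributes=["defaultSecurityDescriptor"])
print(conn.entries[0]["defaultSecurityDescriptor"])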

No new GPOs for You

So far, we removed the domain admin’s right to edit existing GPOs, and their rights to link new GPOs to existing OUs in the domain. Also, we removed their right to edit the GPOptions such as link and enforced states. The last step is to prevent a domain admin from creating new GPOs in the domain’s Group Policy Objects container.

1. Open ADSIEdit.msc. Right-click the ADSI Edit node in the navigation pane and then click Connect to…

2. Configure the Connections Settings dialog similar to the following image. Click OK.

image

3. In the navigation pane, expand the Default naming context until you find the following container: CN=Policies,CN=System,DC=domain,DC=com.

4. Right-click CN=Policies and then click Properties. Click the Security tab.

5. Click Advanced to open the Advanced Security Settings dialog. Add an entry for Domain Admins, and deny the permission to create or delete groupPolicyContainer objects.

image

This last step makes the Create menu item unavailable within GPMC when you attempt to create a new Group Policy Object. The Delete menu item remains available for GPOs; however, attempting a delete results in an access denied error.

An Imperfect Solution:

Many aspects of this scenario require periodic administrative attention, which certainly increases management costs. In addition, domain admins can undo this work partially or entirely, which also increases the difficulty of troubleshooting.

Group Policy was designed to be managed by domain administrators. Attempting to hack a solution brings its fair share of administrative burden (even when it’s working correctly). Why? Because any domain admin can undo the solution with relative ease, making it a monumental waste of time and providing a false sense of security. Since Microsoft does not recommend this scenario, we advise everyone to use AGPM as a beneficial tool and to educate your staff. When they are familiar with it, and have it as readily available as GPMC, they will be more likely to do the right thing by using AGPM to manage GPOs.

And the real solution? Have some consequences when admins choose not to use AGPM. That will straighten people out in a hurry. If your domain admins can't follow simple rules, like using AGPM, then imagine what other dangers lurk behind your back.

Sean "Don't Taz Me Bro" Wright

[Editor’s note: this was Sean’s last post – he left us for greener pastures last week. Good luck man, I hope you can get a chuckle out of your new colleagues with your famous photoshopping – Ned]

AskDS is 12,614,400,000,000,000 shakes old


It’s been four years and 591 posts since AskDS reached critical mass. You’d hope our party would look like this: 

image

But it’s more likely to be:

image

Without you, we’d be another of those sites that glow red hot, go supernova, then collapse into a white dwarf. We really appreciate your comments, questions, and occasional attaboys. Hopefully we’re good for another year of insightful commentary.

Thanks readers.

The AskDS Contributors

Troubleshoot ADFS 2.0 with these new articles


Hi all, here’s a quick public service announcement to highlight some recently published ADFS 2.0 troubleshooting guidance. We get a lot of questions about configuring and troubleshooting ADFS 2.0, so our support and content teams have pitched in to create a series of troubleshooting articles to cover the most common scenarios.

ADFS 2.0 connectivity problems: “This page cannot be displayed” – You receive a “This page cannot be displayed” error message when you try to access an application on a website that uses AD FS 2.0. Provides a resolution.

ADFS 2.0 service configuration and startup issues: ADFS service won’t start – Provides troubleshooting steps for ADFS service configuration and startup problems.

ADFS 2.0 Certificate problems: An error occurred during an attempt to build the certificate chain – A certificate-related change in AD FS 2.0 triggers certificate, SSL, and trust errors, including Event 133. Provides a resolution.

ADFS 2.0 authentication problems: “Not Authorized HTTP error 401” – You cannot authenticate an account in AD FS 2.0, you are prompted for credentials, and event 111 is logged. Provides a resolution.

ADFS 2.0 claims rules problems: “Access is denied” – You receive an “Access Denied” error message when you try to access an application in AD FS 2.0. Provides a resolution.

We hope you will find these troubleshooters useful. You can provide feedback and comments at the bottom of each KB if you want to help us improve them.

Windows 10 Group Policy (.ADMX) Templates now available for download


Hi everyone, Ajay here.  I wanted to let you all know that we have released the Windows 10 Group Policy (.ADMX) templates on our download center as an MSI installer package. These .ADMX templates are released as a separate download package so you can manage group policy for Windows 10 clients more easily.

This new package includes additional (.ADMX) templates which are not included in the RTM version of Windows 10.

 

  1. DeliveryOptimization.admx
  2. fileservervssagent.admx
  3. gamedvr.admx
  4. grouppolicypreferences.admx
  5. grouppolicy-server.admx
  6. mmcsnapins2.admx
  7. terminalserver-server.admx
  8. textinput.admx
  9. userdatabackup.admx
  10. windowsserver.admx

To download the Windows 10 Group Policy (.ADMX) templates, please visit http://www.microsoft.com/en-us/download/details.aspx?id=48257

To review which settings are new in Windows 10, review the Windows 10 ADMX spreadsheet here: http://www.microsoft.com/en-us/download/details.aspx?id=25250

Ajay Sarkaria


Manage Developer Mode on Windows 10 using Group Policy


Hi All,

We’ve had a few folks ask how to disable Developer Mode using Group Policy while still allowing side-loaded apps to be installed. Here is a quick note on how to do this. (A more AD-centric post from Linda Taylor is on its way.)

On the Windows 10 device, click on Windows logo key‌ clip_image001 and then click on Settings.

clip_image002

Click on Update & Security

clip_image003

From the left-side pane, select For developers and from the right-side pane, choose the level that you need.

clip_image004

· If you choose Sideload apps: You can install an .appx and any certificate that is needed to run the app with the PowerShell script that is created with the package. Or you can use manual steps to install the certificate and package separately.

· If you choose Developer mode: You can debug your apps on that device. You can also sideload any apps if you choose developer mode, even ones that you have not developed on the device. You just have to install the .appx with its certificate for sideloading.

Use Group Policy Editor (gpedit) to enable your device:

Using the Group Policy Editor (gpedit.msc), developer mode can be enabled or disabled on computers running Windows 10.

1. Open the Run box by pressing Windows logo key‌ + R.

2. Type in gpedit.msc and then press Enter.

3. In Group Policy Editor navigate to Computer Configuration\Administrative Templates\Windows Components\App Package Deployment.

4. From the right-side pane, double-click Allow all trusted apps to install and select Enabled.

5. Click Apply and then OK.

Notes:

· Allow all trusted apps to install

o If you want to disable access to everything on the For developers page, disable this policy setting.

o If you enable this policy setting, you can install any LOB or developer-signed Windows Store app.

If you want to allow side-loaded apps to install but disable the other options in developer mode, disable "Developer mode" and enable "Allow all trusted apps to install".

· Group policies are applied every 90 minutes, plus or minus a random amount of up to 30 minutes. To apply the policy immediately, run gpupdate from the command prompt (a scripted example follows below).
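
For convenience, here is a minimal sketch for refreshing the policy and checking the result. The AppModelUnlock registry path and value names are my assumption about where the For developers settings are stored, not something taken from the steps above, so verify them on your build before relying on this:

# A minimal sketch; the AppModelUnlock path and value names are assumptions - verify them first.
gpupdate /target:computer /force

$key = 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\AppModelUnlock'
if (Test-Path $key) {
    Get-ItemProperty -Path $key |
        Select-Object AllowAllTrustedApps, AllowDevelopmentWithoutDevLicense
}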

For more information on Developer Mode, see the following MSDN article:
https://msdn.microsoft.com/library/windows/apps/xaml/dn706236.aspx?f=255&MSPPError=-2147217396

SHA1 Key Migration to SHA256 for a two tier PKI hierarchy


Hello. Jim here again to take you through the migration steps for moving your two tier PKI hierarchy from SHA1 to SHA256. I will not be explaining the differences between the two or the supportability / security implementations of either. That information is readily available, easily discoverable and is referenced in the links provided below. Please note the following:

Server Authentication certificates: CAs must begin issuing new certificates using only the SHA-2 algorithm after January 1, 2016. Windows will no longer trust certificates signed with SHA-1 after January 1, 2017.

If your organization uses its own PKI hierarchy (you do not purchase certificates from a third-party), you will not be affected by the SHA1 deprecation. Microsoft's SHA1 deprecation plan ONLY APPLIES to certificates issued by members of the Microsoft Trusted Root Certificate program.  Your internal PKI hierarchy may continue to use SHA1; however, it is a security risk and diligence should be taken to move to SHA256 as soon as possible.

In this post, I will be following the steps documented here with some modifications: Migrating a Certification Authority Key from a Cryptographic Service Provider (CSP) to a Key Storage Provider (KSP) -https://technet.microsoft.com/en-us/library/dn771627.aspx

The steps that follow in this blog will match the steps in the TechNet article above with the addition of screenshots and additional information that the TechNet article lacks.

Additional recommended reading:

The following blog written by Robert Greene will also be referenced and should be reviewed – http://blogs.technet.com/b/askds/archive/2015/04/01/migrating-your-certification-authority-hashing-algorithm-from-sha1-to-sha2.aspx

This Wiki article written by Roger Grimes should also be reviewed as well – http://social.technet.microsoft.com/wiki/contents/articles/31296.implementing-sha-2-in-active-directory-certificate-services.aspx

Microsoft Trusted Root Certificate: Program Requirements – https://technet.microsoft.com/en-us/library/cc751157.aspx

The scenario for this exercise is as follows:

A two tier PKI hierarchy consisting of an Offline ROOT and an Online subordinate enterprise issuing CA.

Operating Systems:
Offline ROOT and Online subordinate are both Windows 2008 R2 SP1

OFFLINE ROOT
CANAME – CONTOSOROOT-CA

clip_image001

ONLINE SUBORDINATE ISSUING CA
CANAME – ContosoSUB-CA

clip_image003

First, you should verify whether your CA is using a Cryptographic Service Provider (CSP) or Key Storage Provider (KSP). This will determine whether you have to go through all the steps or just skip to changing the CA hash algorithm to SHA2. The command for this is in step 3. The line to take note of in the output of this command is “Provider =”. If the Provider = line is any of the top five service providers highlighted below, the CA is using a CSP and you must do the conversion steps. The RSA#Microsoft Software Key Storage Provider and everything below it are KSPs.

clip_image005

Here is sample output of the command – Certutil –store my <Your CA common name>

As you can see, the provider is a CSP.

clip_image006

If you are using a Hardware Security Module (HSM), you should contact your HSM vendor for specific guidance on migrating from a CSP to a KSP. The steps for changing the hashing algorithm to a SHA2 algorithm are the same for HSM-based CAs.

Some customers use their HSM for the CA private/public key, but use Microsoft CSPs for the Encryption CSP (used for the CA Exchange certificate).

We will begin at the OFFLINE ROOT.

BACKUP! BACKUP! BACKUP the CA and Private KEY of both the OFFLINE ROOT and Online issuing CA. If you have more than one CA Certificate (you have renewed multiple times), all of them will need to be backed up.

Use the MMC to back up the private key, or use CERTSRV.msc and right-click the CA name to back up as follows, on both the online subordinate issuing CA and the OFFLINE ROOT CA –

clip_image008

clip_image010

Provide a password for the private key file.

clip_image012

You may also back up the registry location as indicated in step 1C.
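
If you prefer the command line, a minimal sketch of an equivalent backup (the folder path is just an example) run from an elevated prompt on each CA would be:

# Back up the CA database plus the CA certificate and private key
# (supply a password, or respond to the prompt, to protect the exported key)
certutil -backup C:\CABackup

# Back up the CA registry configuration (step 1C)
reg export "HKLM\SYSTEM\CurrentControlSet\Services\CertSvc\Configuration" C:\CABackup\CertSvc-Configuration.reg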

Step 2– Stop the CA Service

Step 3- This command was discussed earlier to determine the provider.

  • Certutil –store my <Your CA common name>

Step 4 and Step 6 from the above referenced TechNet article should be done via the UI.

a. Open the MMC – load the Certificates snapin for the LOCAL COMPUTER

b. Right click each CA certificate (If you have more than 1) – export

c. Yes, export the private key

d. Check – Include all certificates in the certification path if possible

e. Check – Delete the private key if the export is successful

clip_image014

f. Click next and continue with the export.

Step 5
Copy the resultant .pfx file to a Windows 8 or Windows Server 2012 computer

Conversion requires a Windows Server 2012 certutil.exe, as Windows Server 2008 (and prior) do not support the necessary KSP conversion commands. If you want to convert a CA certificate on an ADCS version prior to Windows Server 2012, you must export the CA certificate off of the CA, import onto Windows Server 2012 or later using certutil.exe with the -KSP option, then export the newly signed certificate as a PFX file, and re-import on the original server.

Run the command in Step 5 on the Windows 8 or Windows Server 2012 computer.

  • Certutil –csp <KSP name> -importpfx <Your CA cert/key PFX file>

clip_image016

Step 6

a. To be done on the Windows 8 or Windows Server 2012 computer as previously indicated using the MMC.

b. Open the MMC – load the Certificates snapin for the LOCAL COMPUTER

c. Right click the CA certificate you just imported – All Tasks – export

*I have seen an issue where the “Yes, export the private key” is dimmed after running the conversion command and trying to export via the MMC. If you encounter this behavior, simply reimport the .PFX file manually and check the box Mark this key as exportable during the import. This will not affect the previous conversion.

d. Yes, export the private key.

e. Check – Include all certificates in the certification path if possible

f. Check – Delete the private key if the export is successful

g. Click next and continue with the export.

h. Copy the resultant .pfx file back to the destination 2008 R2 ROOTCA

Step 7

You can again use the UI (MMC) to import the .pfx back to the computer store on the ROOTCA

*Don’t forget during the import to Mark this key as exportable.

clip_image018

***IMPORTANT***

If you have renewed your CA multiple times with the same key, then after exporting the first CA certificate as indicated above in step 4 and step 6, you are breaking the private key association with the previously renewed CA certificates. This is because you are deleting the private key upon successful export. After doing the conversion and importing the resultant .pfx file on the CA (remembering to mark the private key as exportable), you must run the following command from an elevated command prompt for each of the additional CA certificates that were renewed previously:

certutil –repairstore MY serialnumber 

The Serial number is found on the details tab of the CA certificate.  This will repair the association of the public certificate to the private key.


Step 8

Your CSP.reg file must contain the information highlighted at the top –

clip_image020

Step 8c

clip_image022

Step 8d– Run CSP.reg

Step 9

Your EncryptionCSP.reg file must contain the information highlighted at the top –

clip_image024

Step 9c– verification – certutil -v -getreg ca\encryptioncsp\EncryptionAlgorithm

Step 9d– Run EncryptionCsp.reg

Step 10

Change the CA hash algorithm to SHA256

clip_image026
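
The command behind this step, per the TechNet migration article referenced at the top of this post, is run from an elevated command prompt on the CA:

certutil -setreg ca\csp\CNGHashAlgorithm SHA256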

Start the CA Service

Step 11

For a root CA: You will not see the migration take effect for the CA certificate itself until you complete the migration of the root CA, and then renew the certificate for the root CA.

Before we renew the OFFLINE ROOT certificate this is how it looks:

clip_image028

Renewing the CA’s own certificate with a new or existing (same) key would depend on the remaining validity of the certificate. If the certificate is at or nearing 50% of its lifetime, it would be a good idea to renew with a new key. See the following for additional information on CA certificate renewal –

https://technet.microsoft.com/en-us/library/cc730605.aspx

After we renew the OFFLINE ROOT certificate with a new key or the same key, its own Certificate will be signed with the SHA256 signature as indicated in the screenshot below:

clip_image030

Your OFFLINE ROOT CA is now completely configured for SHA256.

Running CERTUTIL –CRL will generate a new CRL file also signed using SHA256

clip_image032

By default, CRT, CRL and delta CRL files are published on the CA in the following location – %SystemRoot%\System32\CertSrv\CertEnroll. The format of the CRL file name is the "sanitized name" of the CA plus, in parentheses, the "key id" of the CA (if the CA certificate has been renewed with a new key) and a .CRL extension. See the following for more information on CRL distribution points and the CRL file name – https://technet.microsoft.com/en-us/library/cc782162%28v=ws.10%29.aspx

Copy this new .CRL file to a domain joined computer and publish it to Active Directory while logged on as an Enterprise Administrator from an elevated command prompt.

Do the same for the new SHA256 ROOT CA certificate.

  • certutil -f -dspublish <.CRT file> RootCA
  • certutil –f -dspublish <.CRL file>

Now continue with the migration of the Online Issuing Subordinate CA.

Step 1– Backup the CA database and Private Key.

Backup the CA registry settings

Step 2– Stop the CA Service.

Step 3- Get the details of your CA certificates

Certutil –store my “Your SubCA name”

image

I have never renewed the Subordinate CA certificate so there is only one.

Step 4 – 6

As you know from what was previously accomplished with the OFFLINE ROOT, steps 4-6 are done via the MMC and we must do the conversion on a Windows 8 or Windows 2012 or later computer for reasons explained earlier.

clip_image035

*When you import the converted SUBCA .pfx file via the MMC, you must remember to again Mark this key as exportable.

Step 8 – Step 9

Creating and importing the registry files for CSP and CSP Encryption (see above)

Step 10- Change the CA hash algorithm to SHA-2

clip_image037

Now in the screenshot below you can see the Hash Algorithm is SHA256.

clip_image039

The Subordinate CA’s own certificate is still SHA1. In order to change this to SHA256 you must renew the Subordinate CA’s certificate. When you renew the Subordinate CA’s certificate it will be signed with SHA256. This is because we previously changed the hash algorithm on the OFFLINE ROOT to SHA256.

Renew the Subordinate CA’s certificate following the proper steps for creating the request and submitting it to the OFFLINE ROOT. Information on whether to renew with a new key or the same key was provided earlier. Then you will copy the resultant .CER file back to the Subordinate CA and install it via the Certification Authority management interface.

If you receive the following error when installing the new CA certificate –

clip_image041

Check the newly procured Subordinate CA certificate via the MMC. On the certification path tab, it will indicate under certificate status that – “The signature of the certificate cannot be verified”

This error could have several causes. You did not –dspublish the new OFFLINE ROOT .CRT file and .CRL file to Active Directory as previously instructed.

clip_image043

Or, you did publish the Root CA certificate, but the Subordinate CA has not performed Autoenrollment (AE) yet and therefore has not downloaded the “NEW” Root CA certificate via AE methods, or AE may be disabled on the CA altogether.

After the files are published to AD and after verification of AE and group policy updates on the Subordinate CA, the install and subsequent starting of Certificate Services will succeed.

Now in addition to the Hash Algorithm being SHA256 on the Subordinate CA, the Signature on its own certificate will also be SHA256.

clip_image045

The Subordinate CA’s .CRL files are also now signed with SHA256 –

clip_image047

Your migration to SHA256 on the Subordinate CA is now completed.

I hope you found this information helpful and informative. I hope it will make your SHA256 migration project planning and implementation less daunting.

Jim Tierney

“Administrative limit for this request was exceeded" Error from Active Directory


Hello, Ryan Ries here with my first AskDS post! I recently ran into an issue with a particular environment where Active Directory and UNIX systems were being integrated.  Microsoft has several attributes in AD to facilitate this, and one of those attributes is the memberUid attribute on security group objects.  You add user IDs to the memberUid attribute of the security group, and Active Directory will treat that as group membership from UNIX systems for the purposes of authentication/authorization.

All was well and good for a long time. The group grew and grew to over a thousand users, until one day we wanted to add another UNIX user, and we were greeted with this error:

“The administrative limit for this request was exceeded.”

Wait, there’s a limit on this attribute? I wonder what that limit is.

MSDN documentation states that the rangeUpper property of the memberUid attribute is 256,000. This support KB also mentions that:

“The attribute size limit for the memberUID attribute in the schema is 256,000 characters. It depends on the individual value length on how many user identifiers (UIDs) will fit into the attribute.”

And you can even see it for yourself if you fancy a gander at your schema:

Something doesn’t add up here – we’ve only added around 1200 users to the memberUid attribute of this security group. Sure it’s a big group, but that doesn’t exceed 256,000 characters; not even close. Adding up all the names that I’ve added to the attribute, I figure it adds up to somewhere around 10,000 characters. Not 256,000.

So what gives?

(If you’ve been following along and you’ve already figured out the problem yourself, then please contact us! We’re hiring!)

The problem here is that we’re hitting a different limit as we continue to add members to the memberUid attribute, way before we get to 256k characters.

The memberUid attribute is a multivalued attribute, however it is not a linked attribute.  This means that it has a limitation on its maximum size that is less than the 256,000 characters shown on the memberUid attributeSchema object.

You can distinguish between which attributes are linked or not based on whether those attributeSchema objects have values in their linkID attribute.

Example of a multivalued and linked attribute:

Example of a multivalued but not linked attribute:

So if the limit is not really 256,000 characters, then what is it?

From How the Data Store Works on TechNet:

“The maximum size of a database record is 8110 bytes, based on an 8-kilobyte (KB) page size. Because of variable overhead requirements and the variable number of attributes that an object might have, it is impossible to provide a precise limit for the maximum number of multivalues that an object can store in its attributes. …

The only value that can actually be computed is the maximum number of values in a nonlinked, multivalued attribute when the object has only one attribute (which is impossible). In Windows 2000 Active Directory, this number is computed at 1575 values. From this value, taking various overhead estimates into account and generalizing about the other values that the object might store, the practical limit for number of multivalues stored by an object is estimated at 800 nonlinked values per object across all attributes.

Attributes that represent links do not count in this value. For example, the members linked, multivalued attribute of a group object can store many thousands of values because the values are links only.

The practical limit of 800 nonlinked values per object is increased in Windows Server 2003 and later. When the forest has a functional level of Windows Server 2003 or higher, for a theoretical record that has only one attribute with the minimum of overhead, the maximum number of multivalues possible in one record is computed at 3937. Using similar estimates for overhead, a practical limit for nonlinked multivalues in one record is approximately 1200. These numbers are provided only to point out that the maximum size of an object is somewhat larger in Windows Server 2003 and later.”

(Emphasis is mine.)

Alright, so according to the above article, if I’m in an Active Directory domain running all Server 2003 or better, which I am, then a “practical” limit for non-linked multi-value attributes should be approximately 1200 values.

So let’s put that to the test, shall we?

I wrote a quick and dirty test script with PowerShell that would generate a random 8-character string from a pool of characters (i.e., a random fictitious user ID,) and then add that random user ID to the memberUid attribute of a security group, in a loop until the script encounters an error because the script can’t add any more values:

# This script is for testing purposes only!
# It requires the ActiveDirectory PowerShell module (RSAT) and an existing
# security group named 'TestGroup' that you are allowed to modify.
Import-Module ActiveDirectory

# Pool of characters used to build random, fictitious 8-character user IDs
$ValidChars = @('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
                'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
                'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D',
                'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N',
                'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9')

[String]$Str = [String]::Empty
[Int]$Bytes  = 0
[Int]$Uids   = 0
While ($Uids -LT 1000000)
{
    # Build a random 8-character fictitious user ID
    $Str = [String]::Empty
    1..8 | % { $Str += ($ValidChars | Get-Random) }
    Try
    {
        # Keep adding values to memberUid until AD refuses with "administrative limit exceeded"
        Set-ADGroup 'TestGroup' -Add @{ memberUid = $Str } -ErrorAction Stop
    }
    Catch
    {
        Write-Error $_.Exception.Message
        Write-Host "$Bytes bytes $Uids users added"
        Break
    }
    $Bytes += 8
    $Uids  += 1
}

Here’s the output from when I run the script:

Huh… whaddya’ know? Approximately 1200 users before we hit the “administrative limit,” just like the article suggests.

One way of getting around this attribute's maximum size would be to use nested groups, or to break the user IDs apart into two separate groups… although this may cause you to have to change some code on your UNIX systems. It’s typically not a fun day when you first realize this limit exists. Better to know about it beforehand.
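
If you want to see how close an existing group already is to this limit, a quick check (assuming the ActiveDirectory module is loaded; the group name is from my test) is:

# Count the non-linked memberUid values currently stored on the group
(Get-ADGroup 'TestGroup' -Properties memberUid).memberUid.Count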

Another attribute in Active Directory that could potentially hit a similar limit is the servicePrincipalName attribute, as you can read about in this AskPFEPlat article.

Until next time!

Ryan Ries

Using Repadmin with ADLDS and Lingering objects


 

Hi! Linda Taylor here from the UK Directory Services escalation team. This time on ADLDS, Repadmin, lingering objects and even PowerShell….

The other day a colleague was trying to remove a lingering object in ADLDS. He asked me about which repadmin syntax would work for ADLDS and it occurred to us both that all the documented examples we found for repadmin were only for AD DS.

So, here are some ADLDS specific examples of repadmin use.

For the purpose of this post I will be using 2 servers with ADLDS. Both servers belong to Root.contoso.com Domain and they replicate a partition called DC=Fabrikam.

    LDS1 runs ADLDS on port 50002.
    RootDC1 runs ADLDS on port 51995.

1. Who is replicating my partition?

If you have many servers in your replica set you may want to find out which ADLDS servers are replicating a specific partition. ….Yes! The AD PowerShell module works against ADLDS.

You just need to add the :port on the end of the servername.

One way to list which servers are replicating a specific application partition is to query the attribute msDs-MasteredBy on the respective partition. This attribute contains a list of NTDS server settings objects for the servers which replicate this partition.

You can do this with ADSIEDIT or ldp.exe or PowerShell or any other means.

PowerShell Example: Use the Get-ADObject cmdlet and target the command at localhost:51995. (I am running this on RootDC1)

powershell_lindakup_ADLDS

Notice there are 2 NTDS Settings objects returned and servername is recorded as ServerName$ADLDSInstanceName.

So this tells me that according to localhost:51995 , DC=Fabrikam partition is replicated between Server LDS1$instance1 and server ROOTDC1$instance1.
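
For reference, the command behind the screenshot is along these lines (a reconstruction using this lab's server and port, not a copy of the exact syntax shown):

# Which NTDS Settings objects master the DC=Fabrikam partition, according to localhost:51995?
Import-Module ActiveDirectory
Get-ADObject -Identity 'DC=Fabrikam' -Server localhost:51995 -Properties msDS-MasteredBy |
    Select-Object -ExpandProperty msDS-MasteredBy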

2. REPADMIN for ADLDS

Generic rules and Tips:

  • For most commands the golden rule is to simply use the port inside the DSA_NAME or DSA_LIST parameters like lds1:50002 or lds1.contoso.com:50002. That’s it!

For example:

CMD

 

  • There are some things which do not apply to ADLDS. That is anything which involves FSMO’s like PDC and RID which ADLDS does not have or Global Catalog – again no such thing in ADLDS.
  • A very useful switch for ADLDS is the /homeserver switch:

Usually by default repadmin assumes you are working with AD and will use the locator or attempt to connect to local server on port 389 if this fails. However, for ADLDS the /Homeserver switch allows you to specify an ADLDS server:port.

For example, If you want to get replication status for all ADLDS servers in a configuration set (like for AD you would run repadmin /showrepl * /csv), for ADLDS you can run the following:

Repadmin /showrepl /homeserver:localhost:50002 * /csv >out.csv

Then you can open the OUT.CSV using something like Excel or even notepad and view a nice summary of the replication status for all servers. You can then sort this and chop it around to your liking.

The below explanation of HOMESERVER is taken from repadmin /listhelp output:

If the DSA_LIST argument is a resolvable server name (such as a DNS or WINS name) this will be used as the homeserver. If a non-resolvable parameter is used for the DSA_LIST, repadmin will use the locator to find a server to be used as the homeserver. If the locator does not find a server, repadmin will try the local box (port 389).

The /homeserver:[dns name] option is available to explicitly control home server selection.

This is especially useful when there are more than one forest or configuration set possible. For

example, the DSA_LIST command "fsmo_istg:site1" would target the locally joined domain's directory, so to target an AD/LDS instance, /homeserver:adldsinstance:50000 could be used to resolve the fsmo_istg to site1 defined in the ADAM configuration set on adldsinstance:50000 instead of the fsmo_istg to site1 defined in the locally joined domain.

Finally, a particular gotcha that can send you in the wrong troubleshooting direction is a LDAP 0x51 “server down” error which is returned if you forget to add the DSA_NAME and/or port to your repadmin command. Like this:

lindakup_CMD2_ADLDS

3. Lingering objects in ADLDS

Just like in AD DS, you can get lingering objects in AD LDS. The only difference is that there is no Global Catalog in AD LDS, and thus no lingering objects are possible in a read-only partition.

EVENT ID 1988 or 2042:

If you bring an outdated instance (past TSL) back online in ADLDS, you may see event 1988 as per http://support.microsoft.com/kb/870695/EN-US “Outdated Active Directory objects generate event ID 1988”.

On Windows Server 2012 R2 you will see event 2042, telling you that it has been longer than the tombstone lifetime since you last replicated, so replication is disabled.

What to do next?

First you want to check for lingering objects and remove if necessary.

1. To check for lingering objects you can use repadmin /removelingeringobjects with the /advisory_mode switch.

My colleague Ian Farr or “Posh chap” as we call him, recently worked with a customer on such a case and put together a great PowerShell blog with a One-Liner for detecting and removing lingering objects from ADLDS with PowerShell. Check it out here:

http://blogs.technet.com/b/poshchap/archive/2014/05/09/one-liner-collect-ad-lds-lingering-object-advisory-mode-1946-events.aspx

Example event 1946:

Event1946

2. Once you have detected any lingering objects and have made the decision to remove them, you can remove them using the same repadmin command as in Ian’s blog but without the /advisory_mode switch.

Example command to remove lingering objects:

Repadmin /removelingeringobjects lds1:50002 8fc92fdd-e5ec-45fb-b7d3-120f9f9f192 DC=Fabrikam

Where Lds1:50002 is the LDS instance and port where to remove lingering objects

8fc92fdd-e5ec-45fb-b7d3-120f9f9f192 is the DSA GUID of a good LDS server/instance

DC=Fabrikam is the partition where to remove lingering objects

For each lingering object removed you will see event 1945.

Event1945

You can use Ian’s one-liner again to get a list of all the objects that were removed.

As a good practice you should also do the lingering object checks for the Configuration partition.

Once all lingering objects are removed replication can be re-enabled again and you can go down the pub…(maybe).

I hope this is useful.

Linda.

Speaking in Ciphers and other Enigmatic tongues…update!


Hi! Jim Tierney here again to talk to you about Cryptographic Algorithms, SCHANNEL and other bits of wonderment. My original post on the topic has gone through a rewrite to bring you up to date on recent changes in this space. 
So, your company purchases this new super awesome vulnerability and compliance management software suite, and they just ran a scan on your Windows Server 2008 domain controllers and lo! The software reports back that you have weak ciphers enabled, highlighted in RED, flashing, with that “you have failed” font, and including a link to the following Microsoft documentation –

KB245030 How to Restrict the Use of Certain Cryptographic Algorithms and Protocols in Schannel.dll:

http://support.microsoft.com/kb/245030/en-us

The report may look similar to this:

SSL Server Has SSLv2 Enabled Vulnerability port 3269/tcp over SSL

THREAT:
The Secure Socket Layer (SSL) protocol allows for secure communication between a client and a server.
There are known flaws in the SSLv2 protocol. A man-in-the-middle attacker can force the communication to a less secure level and then attempt to break the weak encryption. The attacker can also truncate encrypted messages.

SOLUTION:
Disable SSLv2.

Upon hearing this information, you fire up your browser and read the aforementioned KB 245030 top to bottom and RDP into your DC’s and begin checking the locations specified by the article. Much to your dismay you notice the locations specified in the article are not correct concerning your Windows 2008 R2 DC’s. On your 2008 R2 DC’s you see the following at this registry location
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL:

clip_image001

“Darn you Microsoft documentation!!!!!!” you scream aloud as you shake your fist in the general direction of Redmond, WA….

This is how it looks on a Windows 2003 Server:

clip_image002

Easy now…

The registry keys and their content in Windows Server 2008, Windows 7, Windows Server 2008 R2, Windows Server 2012 and 2012 R2 look different from Windows Server 2003 and prior.

Here is the registry location on Windows 7 – 2012 R2 and its default contents:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel]
“EventLogging”=dword:00000001
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Ciphers]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\CipherSuites]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Hashes]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\KeyExchangeAlgorithms]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]
“DisabledByDefault”=dword:00000001

Allow me to explain the above content that is displayed in standard REGEDIT export format:

  • The Ciphers key should contain no values or subkeys
  • The CipherSuites key should contain no values or subkeys
  • The Hashes key should contain no values or subkeys
  • The KeyExchangeAlgorithms key should contain no values or subkeys
  • The Protocols key should contain the following sub-keys and value:
    Protocols
         SSL 2.0
            Client
                DisabledByDefault REG_DWORD 0x00000001 (value)

The following table lists the Windows SCHANNEL protocols and whether or not they are enabled or disabled by default in each operating system listed:

image

*Remember to install the following update if you plan on or are currently using SHA512 certificates:

SHA512 is disabled in Windows when you use TLS 1.2
http://support.microsoft.com/kb/2973337/EN-US

Similar to Windows Server 2003, these protocols can be disabled for the server or client architecture. Meaning that either the protocol can be omitted from the list of supported protocols included in the Client Hello when initiating an SSL connection, or it can be disabled on the server so that even if a client requests SSL 2.0 in a client hello, the server will not respond with that protocol.

The Client and Server subkeys designate each side of the protocol negotiation. You can disable a protocol for either the client or the server, but disabling Ciphers, Hashes, or CipherSuites affects BOTH the client and server sides. You have to create the necessary subkeys beneath the Protocols key to achieve this.

For example:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]
“DisabledByDefault”=dword:00000001
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Server]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 3.0]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 3.0\Client]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 3.0\Server]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.0]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.0\Client]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.0\Server]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1\Client]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1\Server]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Client]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Server]

This is how it looks in the registry after they have been created:

clip_image005

Client SSL 2.0 is disabled by default on Windows Server 2008, 2008 R2, 2012 and 2012 R2.

This means the computer will not use SSL 2.0 to initiate a Client Hello.

So it looks like this in the registry:

clip_image006

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]
DisabledByDefault =dword:00000001

Just like Ciphers and KeyExchangeAlgorithms, Protocols can be enabled or disabled.
To disable other protocols, select the side of the conversation on which you want to disable the protocol, and add the “Enabled”=dword:00000000 value. The example below disables SSL 2.0 for the server in addition to SSL 2.0 for the client.

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]
DisabledByDefault =dword:00000001 <Default client disabled as I said earlier>

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Server]
Enabled =dword:00000000 <disables SSL 2.0 server side>

clip_image007

After this, you will need to reboot the server. You probably do not want to disable TLS settings. I just added them here for a visual reference.

***For Windows server 2008 R2, if you want to enable Server side TLS 1.1 and 1.2, you MUST create the registry entries as follows:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1\Server]
DisabledByDefault =dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Server]
DisabledByDefault =dword:00000000
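
If you would rather script this than hand-edit the registry, here is a minimal PowerShell sketch that mirrors the entries above: it creates the missing keys, sets DisabledByDefault to 0, and leaves everything else alone. Reboot the server afterwards.

# Minimal sketch: enable server-side TLS 1.1 and TLS 1.2 on Windows Server 2008 R2
$base = 'HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols'
foreach ($proto in 'TLS 1.1', 'TLS 1.2') {
    $protoKey  = Join-Path $base $proto
    $serverKey = Join-Path $protoKey 'Server'
    foreach ($key in $protoKey, $serverKey) {
        if (-not (Test-Path $key)) { New-Item -Path $key | Out-Null }
    }
    New-ItemProperty -Path $serverKey -Name 'DisabledByDefault' -PropertyType DWord -Value 0 -Force | Out-Null
}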

So why would you go through all this trouble to disable protocols and such, anyway? Well, there may be a regulatory requirement that your company’s web servers should only support Federal Information Processing Standards (FIPS) 140-1/2 certified cryptographic algorithms and protocols. Currently, TLS is the only protocol that satisfies such a requirement. Luckily, enforcing this compliant behavior does not require you to manually modify registry settings as described above. You can enforce FIPS compliance via group policy as explained by the following:

The effects of enabling the “System cryptography: Use FIPS compliant algorithms for encryption, hashing, and signing” security setting in Windows XP and in later versions of Windows – http://support.microsoft.com/kb/811833

The 811833 article talks specifically about the group policy setting below which by default is NOT defined –

Computer Configuration\ Windows Settings \Security Settings \Local Policies\ Security Options

clip_image008

The policy above when applied will modify the following registry locations and their value content.
Be advised that this FipsAlgorithmPolicy information is stored differently depending on the operating system version –

Windows 7/2008
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy]
“Enabled”=dword:00000000 <Default is disabled>


Windows 2003/XP
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa]
Fipsalgorithmpolicy =dword:00000000 <Default is disabled>

Enabling this group policy setting effectively disables everything except TLS.

More Examples
Let’s continue with more examples. A vulnerability report may also indicate the presence of other Ciphers it deems to be “weak”.

Below I have built a .reg file that when imported will disable the following Ciphers:

56-bit DES

40-bit RC4

Behold!

Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\AES 128]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\AES 256]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\DES 56]
“Enabled”=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\NULL]
“Enabled”=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\RC4 128/128]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\RC4 40/128]
“Enabled”=dword:00000000
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\RC4 56/128]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\Triple DES 168]

After importing these registry settings, you must reboot the server.

The vulnerability report might also mention that 40-bit DES is enabled, but that would be a false positive because Windows Server 2008 doesn’t support 40-bit DES at all. For example, you might see this in a vulnerability report:

Here is the list of weak SSL ciphers supported by the remote server:
Low Strength Ciphers (< 56-bit key)
SSLv3
EXP-ADH-DES-CBC-SHA Kx=DH(512) Au=None Enc=DES(40) Mac=SHA1 export

TLSv1
EXP-ADH-DES-CBC-SHA Kx=DH(512) Au=None Enc=DES(40) Mac=SHA1 export

If this is reported and it is necessary to get rid of these entries, you can also disable the Diffie-Hellman key exchange algorithm (another component of the two cipher suites described above, designated with Kx=DH(512)).

To do this, make the following registry changes:

Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\KeyExchangeAlgorithms\Diffie-Hellman]
“Enabled”=dword:00000000

You have to create the sub-key Diffie-Hellman yourself. Make this change and reboot the server.

This step is NOT advised or required….I am offering it as an option to you to make the vulnerability scanning tool pass the test.

Keep in mind, also, that this will disable any cipher suite that relies upon Diffie-Hellman for key exchange.

You will probably not want to disable ANY cipher suites that rely on Diffie-Hellman. Secure communications such as IPSec and SSL both use Diffie-Hellman for key exchange. If you are running OpenVPN on a Linux/Unix server you are probably using Diffie-Hellman for key exchange. The point I am trying to make here is you should not have to disable the Diffie-Hellman Key Exchange algorithm to satisfy a vulnerability scan.

Advanced Ciphers have arrived!!!
Advanced ciphers were added to Windows 8.1 / Windows Server 2012 R2 computers by KB 2929781, released in April 2014, and again by monthly rollup KB 2919355, released in May 2014.

Updated cipher suites were released as part of two fixes.

KB 2919355 for Windows 8.1 and Windows Server 2012 R2 computers

MS14-066 for Windows 7 and Windows 8 clients and Windows Server 2008 R2 and Windows Server 2012 Servers.

While these updates shipped new ciphers, the cipher suite priority ordering could not be updated correctly.

KB 3042058, released in March 2015, is a follow-up package to correct that issue. It is NOT applicable to Windows Server 2008 (non-R2).

You can set a preference list for which cipher suites the server will negotiate first with a client that supports them.

You can review this MSDN article on how to set the cipher suite prioritization list via GPO: http://msdn.microsoft.com/en-us/library/windows/desktop/bb870930(v=vs.85).aspx#adding__removing__and_prioritizing_cipher_suites

Default location and ordering of Cipher Suites:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002

clip_image010
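
As a side note, on Windows 8.1 / Windows Server 2012 R2 and later you can also list the effective cipher suite order with the TLS PowerShell cmdlets instead of reading the registry directly:

# Lists the cipher suites in their current negotiation order (Windows 8.1 / 2012 R2 and later)
Get-TlsCipherSuite | Select-Object -ExpandProperty Name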

Location of Cipher Suite ordering that is modified by setting this group policy –

Computer Configuration\Administrative Templates\Network\SSL Configuration Settings\SSL Cipher Suite Order

clip_image012

When the SSL Cipher Suite Order group policy is modified and applied successfully it modifies the following location in the registry:

HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Cryptography\Configuration\SSL\00010002

The Group Policy would dictate the effective cipher suites. Once this policy is applied, the settings here take precedence over what is in the default location. The GPO should override anything else configured on the computer. The Microsoft Schannel team does not support directly manipulating the registry.

Group Policy settings are domain settings configured by a domain administrator and should always have precedence over local settings configured by local administrators.

Being secure is a good thing, and depending on your environment, it may be necessary to restrict certain cryptographic algorithms from use. Just make sure you do your due diligence about testing these settings. It is also well worth your time to really understand how the security vulnerability software your company just purchased does its testing. A double-sided network trace will reveal both sides of the client-server hello and which cryptographic algorithms are being offered from each side over the wire.

Jim “Insert cryptic witticism here” Tierney

Does your logon hang after a password change on win 8.1 /2012 R2/win10?


Hi, Linda Taylor here, Senior Escalation Engineer from the Directory Services team in the UK.

I have been working on this issue, which seems to be affecting many of you globally on Windows 8.1, Windows Server 2012 R2 and Windows 10, so I thought it would be a good idea to explain the issue and workarounds while we continue to work on a proper fix here.

The symptoms are such that after a password change, logon hangs forever on the welcome screen:

clip_image002

How annoying….

The underlying issue is a deadlock between several components including DPAPI and the redirector.

For full details or the issue, workarounds and related fixes check out my post on the ASKPFEPLAT blog here http://blogs.technet.com/b/askpfeplat/archive/2016/01/11/does-your-win-8-1-2012-r2-win10-logon-hang-after-a-password-change.aspx

This is now fixed in the following updates:

Windows 8.1, 2012 R2, 2012 install:

For Windows 10 TH2 build 1511 install:

I hope this helps,

Linda

Previewing Server 2016 TP4: Temporary Group Memberships


Disclaimer: Windows Server 2016 is still in a Technical Preview state – the information contained in this post may become inaccurate in the future as the product continues to evolve. More specifically, there are still issues being ironed out in other parts of Privileged Access Management in Technical Preview 4 for multi-forest deployments.   Watch for more updates as we get closer to general availability!

Hello, Ryan Ries here again with some juicy new Active Directory hotness. Windows Server 2016 is right around the corner, and it’s bringing a ton of new features and improvements with it. Today we’re going to talk about one of the new things you’ll be seeing in Active Directory, which you might see referred to as “expiring links,” or what I like to call “temporary group memberships.”

One of the challenges that every security-conscious Active Directory administrator has faced is how to deal with contractors, vendors, temporary employees and anyone else who needs temporary access to resources within your Active Directory environment. Let’s pretend that your Information Security team wants to perform an automated vulnerability scan of all the devices on your network, and to do this, they will need a service account with Domain Administrator privileges for 5 business days. Because you are a wise AD administrator, you don’t like the idea of this service account that will be authenticating against every device on the network having Domain Administrator privileges, but the CTO of the company says that you have to give the InfoSec team what they want.

(Trust me, this stuff really happens.)

So you strike a compromise, claiming that you will grant this service account temporary membership in the Domain Admins group for 5 days while the InfoSec team conducts their vulnerability scan. Now you could just manually remove the service account from the group after 5 days, but you are a busy admin and you know you’re going to forget to do that. You could also set up a scheduled task to run after 5 days that runs a script that removes the service account from the Domain Admins group, but let’s explore a couple of more interesting options.

The Old Way

One old-school way of accomplishing this is through the use of dynamic objects in 2003 and later. Dynamic objects are automatically deleted (leaving no tombstone behind) after their entryTTL expires. Using this knowledge, our plan is to create a security group called “Temp DA for InfoSec” as a dynamic object with a TTL (time-to-live) of 5 days. Then we’re going to put the service account into the temporary security group. Then we are going to add the temporary security group to the Domain Admins group. The service account is now a member of Domain Admins because of the nested group membership, and once the temporary security group automatically disappears in 5 days, the nested group membership will be broken and the service account will no longer be a member of Domain Admins.

Creating dynamic objects is not as simple as just right-clicking in AD Users & Computer and selecting “New > Dynamic Object,” but it’s still pretty easy if you use ldifde.exe and a simple text file. Below is an example:

clip_image002
Figure 1: Creating a Dynamic Object with ldifde.exe.

dn: cn=Temp DA For InfoSec,ou=Information Security,dc=adatum,dc=com
changeType: add
objectClass: group
objectClass: dynamicObject
entryTTL: 432000
sAMAccountName: Temp DA For InfoSec

In the text file, just supply the distinguished name of the security group you want to create, and make sure it has both the group objectClass and the dynamicObject objectClass. I set the entryTTL to 432000 in the screen shot above, which is 5 days in seconds. Import the object into AD using the following command:
  ldifde -i -f dynamicGroup.txt

Now if you go look at the newly-created group in AD Users & Computers, you’ll see that it has an entryTTL attribute that is steadily counting down to 0:

clip_image004
Figure 2: Dynamic Security Group with an expiry date.
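
You can also watch the countdown from PowerShell. This is a sketch using the names from this example, and it assumes Get-ADObject returns the constructed entryTTL attribute when you request it explicitly:

# entryTTL is a constructed attribute, so it has to be requested by name
Get-ADObject -LDAPFilter '(cn=Temp DA For InfoSec)' `
    -SearchBase 'ou=Information Security,dc=adatum,dc=com' -Properties entryTTL |
    Select-Object Name, entryTTL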

You can create all sorts of objects as Dynamic Objects by the way, not just groups. But enough about that. We came here to see how the situation has improved in Windows Server 2016. I think you’ll like it better than the somewhat convoluted Dynamic Objects solution I just described.

The New Hotness (Windows Server 2016 Technical Preview 4, version 1511.10586.122)

For our next trick, we’ll need to enable the Privileged Access Management feature in our Windows Server 2016 forest. This is an Active Directory optional feature, much like the AD Recycle Bin. Keep in mind that, just like the AD Recycle Bin, once you enable the Privileged Access Management feature in your forest, you can’t turn it off. This feature also requires a Windows Server 2016 or “Windows Threshold” forest functional level:

clip_image006
Figure 3: This AD Optional Feature requires a Windows Server 2016 or “Windows Threshold” Forest Functional Level.

It’s easy to enable with PowerShell:
Enable-ADOptionalFeature 'Privileged Access Management Feature' -Scope ForestOrConfigurationSet -Target adatum.com

Now that you’ve done this, you can start setting time limits on group memberships directly. It’s so easy:
Add-ADGroupMember -Identity 'Domain Admins' -Members 'InfoSecSvcAcct' -MemberTimeToLive (New-TimeSpan -Days 5)

Now isn’t that a little easier and more straightforward? Our InfoSec service account now has temporary membership in the Domain Admins group for 5 days. And if you want to view the time remaining in a temporary group membership in real time:
Get-ADGroup 'Domain Admins' -Property member -ShowMemberTimeToLive

clip_image008
Figure 4: Viewing the time-to-live on a temporary group membership.

So that’s cool, but in addition to convenience, there is a real security benefit to this feature that we’ve never had before. I’d be remiss not to mention that with the new Privileged Access Management feature, when you add a temporary group membership like this, the domain controller will actually constrain the Kerberos TGT lifetime to the shortest TTL that the user currently has. What that means is that if a user account only has 5 minutes left in its Domain Admins membership when it logs on, the domain controller will give that account a TGT that’s only good for 5 more minutes before it has to be renewed, and when it is renewed, the PAC (privilege attribute certificate) will no longer contain that group membership! You can see this in action using klist.exe:

clip_image010
Figure 5: My Kerberos ticket is only good for about 8 minutes because of my soon-to-expire group membership.

Awesome.

Lastly, it’s worth noting that this is just one small aspect of the upcoming Privileged Access Management feature in Windows Server 2016. There’s much more to it, like shadow security principals, bastion forests, new integrations with Microsoft Identity Manager, and more. Read more about what’s new in Windows Server 2016 here.

Until next time,

Ryan “Domain Admin for a Minute” Ries


Updated 3/21/16 with additional text in Disclaimer – “Disclaimer: Server 2016 is still in a Technical Preview state – the information contained in this post may become inaccurate in the future as the product continues to evolve.  More specifically, there are still issues being ironed out in other parts of Privileged Access Management in Technical Preview 4 for multi-forest deployments.   Watch for more updates as we get closer to general availability!”


Are your DCs too busy to be monitored?: AD Data Collector Set solutions for long report compile times or report data deletion


Hi all, Herbert Mauerer here. In this post we’re back to talk about the built-in AD Diagnostics Data collector set available for Active Directory Performance (ADPERF) issues and how to ensure a useful report is generated when your DCs are under heavy load.

Why are my domain controllers so busy you ask? Consider this: Active Directory stands in the center of the Identity Management for many customers. It stores the configuration information for many critical line of business applications. It houses certificate templates, is used to distribute group policy and is the account database among many other things. All sorts of network-based services use Active Directory for authentication and other services.

As mentioned there are many applications which store their configuration in Active Directory, including the details of the user context relative to the application, plus objects specifically created for the use of these applications.

There are also applications that use Active Directory as a store to synchronize directory data. There are products like Forefront Identity Manager (and now Microsoft Identity Manager) where synchronizing data is the only purpose. I will not discuss whether these applications are meta-directories or virtual directories, or what class our Office 365 DirSync belongs to…

One way or the other, the volume and complexity of Active Directory queries keep increasing, and there is no end in sight.

So what are my Domain Controllers doing all day?

We get this question a lot from our customers. It often seems as if the AD admins are the last to know what kind of load is put onto the domain controllers by scripts, applications and synchronization engines. And they are not made aware of even significant application changes.

But even small changes can have a drastic effect on the DC performance. DCs are resilient, but even the strongest warrior may fall against an overwhelming force.  Think along the lines of “death by a thousand cuts”.  Consider applications or scripts that run non-optimized or excessive queries on many, many clients during or right after logon and it will feel like a distributed DoS. In this scenario, the domain controller may get bogged down due to the enormous workload issued by the clients. This is one of the classic scenarios when it comes to Domain Controller performance problems.

What resources exist today to help you troubleshoot AD Performance scenarios?

We have already discussed the overall topic in this blog, and today many customer requests start with the complaint that the response times are bad and the LSASS CPU time is high. There also is a blog post specifically on the toolset we’ve had since Windows Server 2008. We also updated and brought back the Server Performance Advisor toolset. This toolset is now more targeted at trend analysis and base-lining.  If a video is more your style, Justin Turner revealed our troubleshooting process at Ignite.

The reports generated by this data collection are hugely useful for understanding what is burdening the Domain Controllers. Less common are the cases where DCs respond slowly even though no significant utilization is seen. We released a blog on that scenario and also gave you a simple method to troubleshoot long-running LDAP queries at our sister site.  So what’s new with this post?

The AD Diagnostic Data Collector set report “report.html” is missing or compile time is very slow

In recent months, we have seen an increasing number of customers with incomplete Data Collector Set reports. Most of the time, the “report.html” file is missing:

This is a folder where the creation of the report.html file was successful:

image

This folder has exceeded the limits for reporting:

image

Notice the report.html file is missing in the second folder example. Also take note that the ETL and BLG files are bigger. What’s the reason for this?

Here is what we uncovered about the Data Collector Set report generation process:

  • When the data collection ends, the process “tracerpt.exe” is launched to create a report for the folder where the data was collected.
  • “tracerpt.exe” runs with “below normal” priority so it does not get full CPU attention especially if LSASS is busy as well.
  • “tracerpt.exe” runs with one worker thread only, so it cannot take advantage of more than one CPU core.
  • “tracerpt.exe” accumulates RAM usage as it runs.
  • “tracerpt.exe” has six hours to complete a report. If it is not done within this time, the report is terminated.
  • The default settings of the system AD data collector delete the biggest data files first once the collection exceeds the 1 gigabyte limit. The biggest single file in the reports is typically “Active Directory.etl”.  The report.html file will not get created if this file does not exist.

I worked with a customer recently with a pretty well-equipped Domain Controller (24 server-class CPUs, 256 GB RAM). The customer was kind enough to run a few tests for various report sizes, and found the following metrics:

  • Until the time-out of six hours is hit, “tracerpt.exe” consumes up to 12 GB of RAM.
  • During this time, one CPU core was pegged at 100%. If a DC is in a high-load condition, you may want to increase the base priority of “tracerpt.exe” to get the report to complete; this comes at the expense of CPU time that could otherwise serve the DC’s clients (see the sketch after this list).
  • The biggest data set that could be completed within the six hours had an “Active Directory.etl” of 3 GB.
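
If you do decide to bump the priority, here is a minimal sketch (PowerShell, run elevated on the DC while the report is compiling; it simply moves the process from BelowNormal to Normal):

# Raise tracerpt.exe to Normal priority so the report compiles faster.
# Trade-off: more CPU contention with LSASS while it runs.
Get-Process -Name tracerpt -ErrorAction SilentlyContinue |
    ForEach-Object { $_.PriorityClass = 'Normal' }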

If you have lower-spec and busier machines, you shouldn’t expect the same results as this example (On a lower spec machine with a 3 GB ETL file, the report.html file would likely fail to compile within the 6-hour window).

What a bummer, how do you get Performance Logging done then?

Fortunately, there are a number of parameters for a Data Collector Set that come to the rescue. Before you can use any of them, you first need a custom Data Collector Set. You can then play with a variety of settings, based on the purpose of the collection.

In Performance Monitor you can create a custom set on the “User Defined” folder by right-clicking it, to bring up the New -> Data Collector Set option in the context menu:

image

This launches a wizard that prompts you for a number of parameters for the new set.

The first thing it wants is a name for the new set:

image

The next step is to select a template. It may be one of the built-in templates or one exported from another computer as an XML file you select through the “Browse” button. In our case, we want to create a clone of “Active Directory Diagnostics”:

image

The next step is optional: it specifies the storage location for the reports. You may want to select a volume with more space or lower IO load than the default volume:

image

There is one more page in the wizard, but there is no reason to make any more changes here. You can click “Finish” on this page.
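
As an aside, if you would rather script the creation than click through the wizard, logman can build the set from an exported template XML (the set name and file name below are placeholders; the XML can be exported from an existing set via its right-click “Save Template…” option):

logman import -n "Big AD Data Collector Set" -xml .\ADDiagTemplate.xml
logman query "Big AD Data Collector Set"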

The default settings are fine for an idle DC, but if you find your ETL files are too large, your reports are not generated, or it takes too long to process the data, you will likely want to make the following configuration changes.

For a real “Big Data Collector Set” we first want to make important changes to the storage strategy of the set, which are available in its “Data Manager” settings:

image

The most relevant settings are “Resource Policy” and “Maximum Root Path Size”. I recommend starting with the settings as shown below:

image

Notice, I’ve changed the Resource policy from “Delete largest” to “Delete oldest”. I’ve also increased the Maximum root path size from 1024 to 2048 MB.  You can run some reports to learn what the best size settings are for you. You might very well end up using 10 GB or more for your reports.

The second crucial parameter for your custom sets is the run interval for the data collection. It is five minutes by default. You can adjust that in the properties of the collector in the “Stop Condition” tab. In many cases shortening the data collection is a viable step if you see continuous high load:

image

You should avoid going shorter than two minutes, as this is the maximum LDAP query duration by default. (If you have LDAP queries that reach this threshold, they would not show up in a report that is less than two minutes in length.) In fact, I would suggest the minimum interval be set to three minutes.

One very attractive option is automatically restarting the data collection if a certain size of data collection is exceeded. You then need to use common sense when you look at the multiple resulting reports, e.g. the ratio of long-running queries is spread across the split logs. But it is definitely better than no report.

If you expect to exceed the 1 GB limit often, you certainly should adjust the total size of collections (Maximum root path size) in the “Data Manager”.

So how do I know how big the collection is while running it?

You can take a look at the folder of the data collection in Explorer, but you will notice it is pretty lazy updating it with the current size of the collection:

image

Explorer only updates the folder if you are doing something with the files. It sounds strange, but attempting to delete a file will trigger an update:

image

Now that makes more sense…

If you see the log is growing beyond your expectations, you can manually stop it before the stop condition hits the threshold you have configured:

image

Of course, you can also start and stop the reporting from a command line using the logman instructions in this post.
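
For example, assuming the custom set created above is named “Big AD Data Collector Set”:

logman start "Big AD Data Collector Set"
logman stop "Big AD Data Collector Set"

The same verbs should also work for the built-in set if you address it by its system name (“system\Active Directory Diagnostics”).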

Room for improvement

We are aware there is room for improvement to get bigger data sets reported in a shorter time. The good news is that many of these special configuration changes won’t be needed once your DCs are running on Windows Server 2016. We will talk about that in a future post.

Thanks for reading.

Herbert

Setting up Virtual Smart card logon using Virtual TPM for Windows 10 Hyper-V VM Guests

Hello Everyone, my name is Raghav and I’m a Technical Advisor for one of the Microsoft Active Directory support teams. This is my first blog and today I’ll share with you how to configure a Hyper-V environment in order to enable virtual smart card logon to VM guests by leveraging a new Windows 10 feature: virtual Trusted Platform Module (TPM).

Here’s a quick overview of the terminology discussed in this post:
  • Smart cards are physical authentication devices, which improve on the concept of a password by requiring that users actually have their smart card device with them to access the system, in addition to knowing the PIN, which provides access to the smart card.
  • Virtual smart cards (VSCs) emulate the functionality of traditional smart cards, but instead of requiring the purchase of additional hardware, they utilize technology that users already own and are more likely to have with them at all times. Theoretically, any device that can provide the three key properties of smart cards (non-exportability, isolated cryptography, and anti-hammering) can be commissioned as a VSC, though the Microsoft virtual smart card platform is currently limited to the use of the Trusted Platform Module (TPM) chip onboard most modern computers. This blog will mostly concern TPM virtual smart cards.
    For more information, read Understanding and Evaluating Virtual Smart Cards.
  • Trusted Platform Module - (As Christopher Delay explains in his blog) TPM is a cryptographic device that is attached at the chip level to a PC, Laptop, Tablet, or Mobile Phone. The TPM securely stores measurements of various states of the computer, OS, and applications. These measurements are used to ensure the integrity of the system and software running on that system. The TPM can also be used to generate and store cryptographic keys. Additionally, cryptographic operations using these keys take place on the TPM preventing the private keys of certificates from being accessed outside the TPM.
  • Virtualization-based security – The following Information is taken directly from https://technet.microsoft.com/en-us/itpro/windows/keep-secure/windows-10-security-guide
    • One of the most powerful changes to Windows 10 is virtual-based security. Virtual-based security (VBS) takes advantage of advances in PC virtualization to change the game when it comes to protecting system components from compromise. VBS is able to isolate some of the most sensitive security components of Windows 10. These security components aren’t just isolated through application programming interface (API) restrictions or a middle-layer: They actually run in a different virtual environment and are isolated from the Windows 10 operating system itself.
    • VBS and the isolation it provides is accomplished through the novel use of the Hyper V hypervisor. In this case, instead of running other operating systems on top of the hypervisor as virtual guests, the hypervisor supports running the VBS environment in parallel with Windows and enforces a tightly limited set of interactions and access between the environments. Think of the VBS environment as a miniature operating system: It has its own kernel and processes. Unlike Windows, however, the VBS environment runs a micro-kernel and only two processes called trustlets
  • Local Security Authority (LSA) enforces Windows authentication and authorization policies. LSA is a well-known security component that has been part of Windows since 1993. Sensitive portions of LSA are isolated within the VBS environment and are protected by a new feature called Credential Guard.
  • Hypervisor-enforced code integrity verifies the integrity of kernel-mode code prior to execution. This is a part of the Device Guard feature.
VBS provides two major improvements in Windows 10 security: a new trust boundary between key Windows system components and a secure execution environment within which they run. A trust boundary between key Windows system components is enabled though the VBS environment’s use of platform virtualization to isolate the VBS environment from the Windows operating system. Running the VBS environment and Windows operating system as guests on top of Hyper-V and the processor’s virtualization extensions inherently prevents the guests from interacting with each other outside the limited and highly structured communication channels between the trustlets within the VBS environment and Windows operating system.
VBS acts as a secure execution environment because the architecture inherently prevents processes that run within the Windows environment – even those that have full system privileges – from accessing the kernel, trustlets, or any allocated memory within the VBS environment. In addition, the VBS environment uses TPM 2.0 to protect any data that is persisted to disk. Similarly, a user who has access to the physical disk is unable to access the data in an unencrypted form.
clip_image002[4]
VBS requires a system that includes:
  • Windows 10 Enterprise Edition
  • A 64-bit processor
  • UEFI with Secure Boot
  • Second-Level Address Translation (SLAT) technologies (for example, Intel Extended Page Tables [EPT], AMD Rapid Virtualization Indexing [RVI])
  • Virtualization extensions (for example, Intel VT-x, AMD RVI)
  • I/O memory management unit (IOMMU) chipset virtualization (Intel VT-d or AMD-Vi)
  • TPM 2.0
Note: TPM 1.2 and 2.0 provide protection for encryption keys that are stored in the firmware. TPM 1.2 is not supported on Windows 10 RTM (Build 10240); however, it is supported in Windows 10, Version 1511 (Build 10586) and later.
Among other functions, Windows 10 uses the TPM to protect the encryption keys for BitLocker volumes, virtual smart cards, certificates, and the many other keys that the TPM is used to generate. Windows 10 also uses the TPM to securely record and protect integrity-related measurements of select hardware.
Now that we have the terminology clarified, let’s talk about how to set this up.

Setting up Virtual TPM
First we will ensure we meet the basic requirements on the Hyper-V host.
On the Hyper-V host, launch msinfo32 and confirm the following values:

The BIOS Mode should state “UEFI”.

clip_image001
Secure Boot State should be On.
clip_image002
Next, we will enable VBS on the Hyper-V host.
  1. Open up the Local Group Policy Editor by running gpedit.msc.
  2. Navigate to the following settings: Computer Configuration, Administrative Templates, System, Device Guard. Double-click Turn On Virtualization Based Security, set the policy to Enabled, and click OK.
clip_image004
Now we will enable Isolated User Mode on the Hyper-V host.
1. To do that, go to Run, type appwiz.cpl, and in the left pane click Turn Windows Features on or off.
Check Isolated User Mode, click OK, and then reboot when prompted.
clip_image006
This completes the initial steps needed for the Hyper-V host.
Now we will enable support for virtual TPM on your Hyper-V VM guest
Note: Support for Virtual TPM is only included in Generation 2 VMs running Windows 10.
To enable this on your Windows 10 Generation 2 VM, open the VM settings and review the configuration under the Hardware, Security section. Enable Secure Boot and Enable Trusted Platform Module should both be selected.
clip_image008
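
If you prefer PowerShell on the Hyper-V host, the equivalent of the checkboxes above should look roughly like this (a sketch; “Win10VM” is a placeholder VM name, the VM should be powered off, and the local key protector is fine for a lab without a Host Guardian Service):

# Generation 2 VM: make sure Secure Boot is on, then key-protect the VM and enable the vTPM
Set-VMFirmware -VMName 'Win10VM' -EnableSecureBoot On
Set-VMKeyProtector -VMName 'Win10VM' -NewLocalKeyProtector
Enable-VMTPM -VMName 'Win10VM'

Inside the guest, Get-Tpm should then report TpmPresent and TpmReady as True.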
That completes the virtual TPM part of the configuration.  We will now move on to the virtual smart card configuration.
Setting up Virtual Smart Card
In the next section, we create a certificate template so that we can request a certificate that has the required parameters needed for Virtual Smart Card logon.
These steps are adapted from the following TechNet article: https://technet.microsoft.com/en-us/library/dn579260.aspx
Prerequisites and Configuration for Certificate Authority (CA) and domain controllers
  • Active Directory Domain Services
  • Domain controllers must be configured with a domain controller certificate to authenticate smartcard users. The following article covers Guidelines for enabling smart card logon: http://support.microsoft.com/kb/281245
  • An Enterprise Certification Authority running on Windows Server 2012 or Windows Server 2012 R2. Again, Chris’s blog neatly covers how to set up a PKI environment.
  • Active Directory must have the issuing CA in the NTAuth store to authenticate users to Active Directory.
Create the certificate template
1. On the CA console (certsrv.msc), right-click Certificate Templates and select Manage
clip_image010
2. Right-click the Smartcard Logon template and then click Duplicate Template
clip_image012
3. On the Compatibility tab, set the compatibility settings as below
clip_image014
4. On the Request Handling tab, in the Purpose section, select Signature and smartcard logon from the drop down menu
clip_image016
5. On the Cryptography tab, select the Requests must use one of the following providers radio button and then select the Microsoft Base Smart Card Crypto Provider option.
clip_image018
Optionally, you can use a Key Storage Provider (KSP): under Provider Category, select Key Storage Provider, then select the Requests must use one of the following providers radio button and choose the Microsoft Smart Card Key Storage Provider option.
clip_image020
6. On the General tab: Specify a name, such as TPM Virtual Smart Card Logon. Set the validity period to the desired value and choose OK

7. Navigate to Certificate Templates in the CA console, right-click it and select New, then Certificate Template to Issue.  Select the new template you created in the prior steps.

clip_image022
Note that it usually takes some time for this certificate to become available for issuance.

Create the TPM virtual smart card

Next we’ll create a virtual Smart Card on the Virtual Machine by using the Tpmvscmgr.exe command-line tool.

1. On the Windows 10 Gen 2 Hyper-V VM guest, open an Administrative Command Prompt and run the following command:
tpmvscmgr.exe create /name myVSC /pin default /adminkey random /generate
clip_image024
You will be prompted for a pin.  Enter at least eight characters and confirm the entry.  (You will need this pin in later steps)

Enroll for the certificate on the Virtual Smart Card Certificate on Virtual Machine.
1. In certmgr.msc, right click Certificates, click All Tasks then Request New Certificate.
clip_image025
2. On the certificate enrollment select the new template you created earlier.
clip_image027
3. It will prompt for the PIN associated with the Virtual Smart Card. Enter the PIN and click OK.
clip_image029
4. If the request completes successfully, it will display the Certificate Installation Results page
clip_image031
5. On the virtual machine’s sign-in screen, select Sign-in options, choose the security device, and enter the PIN
clip_image033
That completes the steps on how to deploy Virtual Smart Cards using a virtual TPM on virtual machines.  Thanks for reading!

Raghav Mahajan

The Version Store Called, and They’re All Out of Buckets


Hello, Ryan Ries back at it again with another exciting installment of esoteric Active Directory and ESE database details!

I think we need to have another little chat about something called the version store.

The version store is an inherent mechanism of the Extensible Storage Engine and a commonly seen concept among databases in general. (ESE is sometimes referred to as Jet Blue. Sometimes old codenames are so catchy that they just won’t die.) Therefore, the following information should be relevant to any application or service that uses an ESE database (such as Exchange,) but today I’m specifically focused on its usage as it pertains to Active Directory.

The version store is one of those details that the majority of customers will never need to think about. The stock configuration of the version store for Active Directory will be sufficient to handle any situation encountered by 99% of AD administrators. But for that 1% out there with exceptionally large and/or busy Active Directory deployments, (or for those who make “interesting” administrative choices,) the monitoring and tuning of the version store can become a very important topic. And quite suddenly too, as replication throughout your environment grinds to a halt because of version store exhaustion and you scramble to figure out why.

The purpose of this blog post is to provide up-to-date (as of the year 2016) information and guidance on the version store, and to do it in a format that may be more palatable to many readers than sifting through reams of old MSDN and TechNet documentation that may or may not be accurate or up to date. I can also offer more practical examples than you would probably get from straight technical documentation. There has been quite an uptick lately in the number of cases we’re seeing here in Support that center around version store exhaustion. While the job security for us is nice, knowing this stuff ahead of time can save you from having to call us and spend lots of costly support hours.

Version Store: What is it?

As mentioned earlier, the version store is an integral part of the ESE database engine. It’s an area of temporary storage in memory that holds copies of objects that are in the process of being modified, for the sake of providing atomic transactions. This allows the database to roll back transactions in case it can’t commit them, and it allows other threads to read from a copy of the data while it’s in the process of being modified. All applications and services that utilize an ESE database use version store to some extent. The article “How the Data Store Works” describes it well:

“ESE provides transactional views of the database. The cost of providing these views is that any object that is modified in a transaction has to be temporarily copied so that two views of the object can be provided: one to the thread inside that transaction and one to threads in other transactions. This copy must remain as long as any two transactions in the process have different views of the object. The repository that holds these temporary copies is called the version store. Because the version store requires contiguous virtual address space, it has a size limit. If a transaction is open for a long time while changes are being made (either in that transaction or in others), eventually the version store can be exhausted. At this point, no further database updates are possible.”

When Active Directory was first introduced, it was deployed on machines with a single x86 processor with less than 4 GB of RAM supporting NTDS.DIT files that ranged between 2MB and a few hundred MB. Most of the documentation you’ll find on the internet regarding the version store still has its roots in that era and was written with the aforementioned hardware in mind. Today, things like hardware refreshes, OS version upgrades, cloud adoption and an improved understanding of AD architecture are driving massive consolidation in the number of forests, domains and domain controllers in them, DIT sizes are getting bigger… all while still relying on default configuration values from the Windows 2000 era.

The number-one killer of version store is long-running transactions. Transactions that tend to be long-running include, but are not limited to:

– Deleting a group with 100,000 members
– Deleting any object, not just a group, with 100,000 or more forward/back links to clean
– Modifying ACLs in Active Directory on a parent container that propagate down to many thousands of inheriting child objects
– Creating new database indices
– Having underpowered or overtaxed domain controllers, causing transactions to take longer in general
– Anything that requires boat-loads of database modification
– Large SDProp and garbage collection tasks
– Any combination thereof

I will show some examples of the errors that you would see in your event logs when you experience version store exhaustion in the next section.

Monitoring Version Store Usage

To monitor version store usage, leverage the Performance Monitor (perfmon) counter:

‘\\dc01\Database ==> Instances(lsass/NTDSA)\Version buckets allocated’

image
(Figure 1: The ‘Version buckets allocated’ perfmon counter.)
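
If you want to watch this counter without opening the Performance Monitor UI, here is a minimal Get-Counter sketch (dc01 is a placeholder; the counter path is the one shown in Figure 1):

# Sample 'Version buckets allocated' every 15 seconds, 40 times
Get-Counter -Counter '\\dc01\Database ==> Instances(lsass/NTDSA)\Version buckets allocated' -SampleInterval 15 -MaxSamples 40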

The version store divides the amount of memory that it has been given into “buckets,” or “pages.” Version store pages need not (and in AD, they do not) equal the size of database pages elsewhere in the database. We’ll get into the exact size of these buckets in a minute.

During typical operation, when the database is not busy, this counter will be low. It may even be zero if the database really just isn’t doing anything. But when you perform one of those actions that I mentioned above that qualify as “long-running transactions,” you will trigger a spike in the version store usage. Here is an example of me deleting a group that contains 200,000 members, on a DC running 2012 R2 with 1 64bit CPU:

image(Figure 2: Deleting a group containing 200k members on a 2012 R2 DC with 1 64bit CPU.)

The version store spikes to 5332 buckets allocated here, seconds after I deleted the group, but as long as the DC recovers and falls back down to nominal levels, you’ll be alright. If it stays high or even maxed out for extended periods of time, then no more database transactions for you. This includes no more replication. This is just an example using the common member/memberOf relationship, but any linked-value attribute relationship can cause this behavior. (I’ve talked a little about linked value attributes before here.) There are plenty of other types of objects that may invoke this same kind of behavior, such as deleting an RODC computer object, and then its msDs-RevealedUsers links must be processed, etc..

I’m not saying that deleting a group with fewer than 200K members couldn’t also trigger version store exhaustion if there are other transactions taking place on your domain controller simultaneously or other extenuating circumstances. I’ve seen transactions involving as few as 70K linked values cause major problems.

After you delete an object in AD, and the domain controller turns it into a tombstone, each domain controller has to process the linked-value attributes of that object to maintain the referential integrity of the database. It does this in “batches,” usually 1000 or 10,000 depending on Windows version and configuration. This was only very recently documented here. Since each “batch” of 1000 or 10,000 is considered a single transaction, a smaller batch size will tend to complete faster and thus require less version store usage. (But the overall job will take longer.)

An interesting curveball here is that having the AD Recycle Bin enabled will defer this action by an msDs-DeletedObjectLifetime number of days after an object is deleted, since that’s the appeal behind the AD Recycle Bin – it allows you to easily restore deleted objects with all their links intact. (More detail on the AD Recycle Bin here.)

When you run out of version storage, no other database transactions can be committed until the transaction or transactions that are causing the version store exhaustion are completed or rolled back. At this point, most people start rebooting their domain controllers, and this may or may not resolve the immediate issue for them depending on exactly what’s going on. Another thing that may alleviate this issue is offline defragmentation of the database. (Or reducing the links batch size, or increasing the version store size – more on that later.) Again, we’re usually looking at 100+ gigabyte DITs when we see this kind of issue, so we’re essentially talking about pushing the limits of AD. And we’re also talking about hours of downtime for a domain controller while we do that offline defrag and semantic database analysis.

Here, Active Directory is completely tapping out the version store. Notice the plateau once it has reached its max:

image(Figure 3: Version store being maxed out at 13078 buckets on a 2012 R2 DC with 1 64bit CPU.)

So it has maxed out at 13,078 buckets.

When you hit this wall, you will see events such as these in your event logs:

Log Name: Directory Service
Source: Microsoft-Windows-ActiveDirectory_DomainService
Date: 5/16/2016 5:54:52 PM
Event ID: 1519
Task Category: Internal Processing
Level: Error
Keywords: Classic
User: S-1-5-21-4276753195-2149800008-4148487879-500
Computer: DC01.contoso.com
Description:
Internal Error: Active Directory Domain Services could not perform an operation because the database has run out of version storage.

And also:

Log Name: Directory Service
Source: NTDS ISAM
Date: 5/16/2016 5:54:52 PM
Event ID: 623
Task Category: (14)
Level: Error
Keywords: Classic
User: N/A
Computer: DC01.contoso.com
Description:
NTDS (480) NTDSA: The version store for this instance (0) has reached its maximum size of 408Mb. It is likely that a long-running transaction is preventing cleanup of the version store and causing it to build up in size. Updates will be rejected until the long-running transaction has been completely committed or rolled back.

The peculiar “408Mb” figure that comes along with that last event leads us into the next section…

How big is the Version Store by default?

The “How the Data Store Works” article that I linked to earlier says:

“The version store has a size limit that is the lesser of the following: one-fourth of total random access memory (RAM) or 100 MB. Because most domain controllers have more than 400 MB of RAM, the most common version store size is the maximum size of 100 MB.”

Incorrect.

And then you have other articles that have even gone to print, such as this one, that say:

“Typically, the version store is 25 percent of the physical RAM.”

Extremely incorrect.

What about my earlier question about the bucket size? Well if you consulted this KB article you would read:

“The value for the setting is the number of 16KB memory chunks that will be reserved.”

Nope, that’s wrong.

Or if I go to the MSDN documentation for ESE:

“JET_paramMaxVerPages
This parameter reserves the requested number of version store pages for use by an instance.

Each version store page as configured by this parameter is 16KB in size.”

Not true.

The pages are not 16KB anymore on 64bit DCs. And the only time that the “100MB” figure was ever even close to accurate was when domain controllers were 32bit and had 1 CPU. But today, domain controllers are 64bit and have lots of CPUs. Both the version store bucket size and the default number of version store buckets allocated double based on whether your domain controller is 32bit or 64bit. And the figure also scales a little bit based on how many CPUs are in your domain controller.

So without further ado, here is how to calculate the actual number of buckets that Active Directory will allocate by default:

(2 * (3 * (15 + 4 + 4 * #CPUs)) + 6400) * PointerSize / 4

Pointer size is 4 if you’re using a 32bit processor, and 8 if you’re using a 64bit processor.

And secondly, version store pages are 16KB if you’re on a 32bit processor, and 32KB if you’re on a 64bit processor. So using a 64bit processor effectively quadruples the default size of your AD version store. To convert number of buckets allocated into bytes for a 32bit processor:

(((2 * (3 * (15 + 4 + 4 * 1)) + 6400) * 4 / 4) * 16KB) / 1MB

And for a 64bit processor:

(((2 * (3 * (15 + 4 + 4 * 1)) + 6400) * 8 / 4) * 32KB) / 1MB

So using the above formulae, the version store size for a single-core, 64bit DC would be ~408MB, which matches that event ID 623 we got from ESE earlier. It also conveniently matches 13078 * 32KB buckets, which is where we plateaued with our perfmon counter earlier.
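
If you would rather not do the math by hand, here is a small PowerShell sketch of the same formula (it only reproduces the default sizing described in this post; it does not query anything on the DC):

# Default AD version store size, per the formula above
function Get-DefaultVersionStoreMB {
    param([int]$Cpus = 1, [switch]$Is32Bit)
    $pointerSize = if ($Is32Bit) { 4 } else { 8 }
    $bucketKB    = if ($Is32Bit) { 16 } else { 32 }
    $buckets = (2 * (3 * (15 + 4 + 4 * $Cpus)) + 6400) * $pointerSize / 4
    [math]::Round(($buckets * $bucketKB) / 1024, 1)
}
Get-DefaultVersionStoreMB -Cpus 1            # ~408 MB on a 1-core x64 DC
Get-DefaultVersionStoreMB -Cpus 1 -Is32Bit   # ~102 MB on a 1-core x86 DC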

If you had a 4-core, 64bit domain controller, the formula would come out to ~412MB, and you will see this line up with the event log event ID 623 on that machine. When a 4-core, Windows 2008 R2 domain controller with default configuration runs out of version store:

Log Name:      Directory Service
Source:        NTDS ISAM
Date:          5/15/2016 1:18:25 PM
Event ID:      623
Task Category: (14)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      dc02.fabrikam.com
Description:
NTDS (476) NTDSA: The version store for this instance (0) has reached its maximum size of 412Mb. It is likely that a long-running transaction is preventing cleanup of the version store and causing it to build up in size. Updates will be rejected until the long-running transaction has been completely committed or rolled back.

The version store size for a single-core, 32bit DC is ~102MB. This must be where the original “100MB” adage came from. But as you can see now, that information is woefully outdated.

The 6400 number in the equation comes from the fact that 6400 is the absolute, hard-coded minimum number of version store pages/buckets that AD will give you. Turns out that’s about 100MB, if you assumed 16KB pages, or 200MB if you assume 32KB pages. The interesting side-effect from this is that the documented “EDB max ver pages (increment over the minimum)” registry entry, which is the supported way of increasing your version store size, doesn’t actually have any effect unless you set it to some value greater than 6400 decimal. If you set that registry key to something less than 6400, then it will just get overridden to 6400 when AD starts. But if you set that registry entry to, say, 9600 decimal, then your version store size calculation will be:

(((2 *(3 * (15 + 4 + 4 * 1)) + 9600) * 8 / 4) * 32KB) / 1MB = 608.6MB

For a 64bit, 1-core domain controller.

So let’s set those values on a DC, then run up the version store, and let’s get empirical up in here:

image(Figure 4: Version store exhaustion at 19478 buckets on a 2012 R2 DC with 1 64bit CPU.)

(19478 * 32KB) / 1MB = 608.7MB

And wouldn’t you know it, the event log now reads:

image(Figure 5: The event log from the previous version store exhaustion, showing the effect of setting the “EDB max ver pages (increment over the minimum)” registry value to 9600.)

Here’s a table that shows version store sizes based on the “EDB max ver pages (increment over the minimum)” value and common CPU counts:

Buckets | 1 CPU | 2 CPUs | 4 CPUs | 8 CPUs | 16 CPUs
6400 (the default) | x64: 410 MB / x86: 103 MB | x64: 412 MB / x86: 103 MB | x64: 415 MB / x86: 104 MB | x64: 421 MB / x86: 105 MB | x64: 433 MB / x86: 108 MB
9600 | x64: 608 MB / x86: 152 MB | x64: 610 MB / x86: 153 MB | x64: 613 MB / x86: 153 MB | x64: 619 MB / x86: 155 MB | x64: 631 MB / x86: 158 MB
12800 | x64: 808 MB / x86: 202 MB | x64: 810 MB / x86: 203 MB | x64: 813 MB / x86: 203 MB | x64: 819 MB / x86: 205 MB | x64: 831 MB / x86: 208 MB
16000 | x64: 1008 MB / x86: 252 MB | x64: 1010 MB / x86: 253 MB | x64: 1013 MB / x86: 253 MB | x64: 1019 MB / x86: 255 MB | x64: 1031 MB / x86: 258 MB
19200 | x64: 1208 MB / x86: 302 MB | x64: 1210 MB / x86: 303 MB | x64: 1213 MB / x86: 303 MB | x64: 1219 MB / x86: 305 MB | x64: 1231 MB / x86: 308 MB
Sorry for the slight rounding errors – I just didn’t want to deal with decimals. As you can see, the number of CPUs in your domain controller only has a slight effect on the version store size. The processor architecture, however, makes all the difference. Good thing absolutely no one uses x86 DCs anymore, right?

Now I want to add a final word of caution.

I want to make it clear that we recommend changing the “EDB max ver pages (increment over the minimum)” only when necessary; when the event ID 623s start appearing. (If it ain’t broke, don’t fix it.) I also want to reiterate the warnings that appear on the support KB: you must not set this value arbitrarily high, you should increase this setting in small increments (50MB or 100MB at a time), and if setting the value to 19200 buckets still does not resolve your issue, then you should contact Microsoft Support. If you are going to change this value, it is advisable to change it consistently across all domain controllers, but you must also carefully consider the processor architecture and available memory on each DC before you change this setting. The version store requires a contiguous allocation of memory – precious real-estate – and raising the value too high can prevent lsass from being able to perform other work. Once the problem has subsided, you should then return this setting back to its default value.

In my next post on this topic, I plan on going into more detail on how one might actually troubleshoot the issue and track down the reason behind why the version store exhaustion is happening.

Conclusions

There is a lot of old documentation out there that has misled many an AD administrator on this topic. It was essentially accurate at the time it was written, but AD has evolved since then. I hope that with this post I was able to shed more light on the topic than you probably ever thought was necessary. It’s an undeniable truth that more and more of our customers continue to push the limits of AD beyond that which was originally conceived. I also want to remind the reader that the majority of the information in this article is AD-specific. If you’re thinking about Exchange or Certificate Services or Windows Update or DFSR or anything else that uses an ESE database, then you need to go figure out your own application-specific details, because we don’t use the same page sizes or algorithms as those guys.

I hope this will be valuable to those who find themselves asking questions about the ESE version store in Active Directory.

With love,

Ryan “Buckets of Fun” Ries


Deploying Group Policy Security Update MS16-072 \ KB3163622


My name is Ajay Sarkaria & I work with the Windows Supportability team at Microsoft. There have been many questions on deploying the newly released security update MS16-072.

This post was written to provide guidance and answer questions needed by administrators to deploy the newly released security update, MS16-072 that addresses a vulnerability. The vulnerability could allow elevation of privilege if an attacker launches a man-in-the-middle (MiTM) attack against the traffic passing between a domain controller and the target machine on domain-joined Windows computers.

The table below summarizes the KB article number for the relevant Operating System:

Article # | Title | Context / Synopsis
MSKB 3163622 | MS16-072: Security Updates for Group Policy: June 14, 2016 | Main article for MS16-072
MSKB 3159398 | MS16-072: Description of the security update for Group Policy: June 14, 2016 | MS16-072 for Windows Vista / Windows Server 2008, Windows 7 / Windows Server 2008 R2, Windows Server 2012, Windows 8.1 / Windows Server 2012 R2
MSKB 3163017 | Cumulative update for Windows 10: June 14, 2016 | MS16-072 for Windows 10 RTM
MSKB 3163018 | Cumulative update for Windows 10 Version 1511 and Windows Server 2016 Technical Preview 4: June 14, 2016 | MS16-072 for Windows 10 1511 + Windows Server 2016 TP4
MSKB 3163016 | Cumulative Update for Windows Server 2016 Technical Preview 5: June 14, 2016 | MS16-072 for Windows Server 2016 TP5
TN: MS16-072 | Microsoft Security Bulletin MS16-072 – Important | Overview of changes in MS16-072
What does this security update change?

The most important aspect of this security update is to understand the behavior changes affecting the way User Group Policy is applied on a Windows computer. MS16-072 changes the security context with which user group policies are retrieved. Traditionally, when a user group policy is retrieved, it is processed using the user’s security context.

After MS16-072 is installed, user group policies are retrieved by using the computer’s security context. This by-design behavior change protects domain joined computers from a security vulnerability.

When a user group policy is retrieved using the computer’s security context, the computer account will now need “read” access to retrieve the group policy objects (GPOs) needed to apply to the user.

Traditionally, all group policies were read if the user had read access, either directly or through membership in a domain group, e.g. Authenticated Users.

What do we need to check before deploying this security update?

As discussed above, by default “Authenticated Users” have “Read” and “Apply Group Policy” on all Group Policy Objects in an Active Directory Domain.

Below is a screenshot from the Default Domain Policy:

If permissions on the Group Policy Objects in your Active Directory domain have not been modified and are still using the defaults, and as long as Kerberos authentication is working fine in your Active Directory forest (i.e. there are no Kerberos errors visible in the system event log on client computers while accessing domain resources), there is nothing else you need to verify before you deploy the security update.

In some deployments, administrators may have removed the “Authenticated Users” group from some or all Group Policy Objects (Security filtering, etc.)

In such cases, you will need to make sure of the following before you deploy the security update:

  1. Check if “Authenticated Users” group read permissions were removed intentionally by the admins. If not, then you should probably add those back. For example, if you do not use any security filtering to target specific group policies to a set of users, you could add “Authenticated Users” back with the default permissions as shown in the example screenshot above.
  2. If the “Authenticated Users” permissions were removed intentionally (security filtering, etc), then as a result of the by-design change in this security update (i.e. to now use the computer’s security context to retrieve user policies), you will need to add the computer account retrieving the group policy object (GPO) to “Read” Group Policy (and not “Apply group policy“).

    Example Screenshot:

In the above example screenshot, let’s say an administrator wants “User-Policy” (the name of the Group Policy Object) to apply only to the user named “MSFT Ajay” and not to any other user; the above is how the Group Policy would have been filtered for other users. “Authenticated Users” has been removed intentionally in this example scenario.

Notice that no other user or group has “Read” or “Apply Group Policy” permissions other than the default Domain Admins and Enterprise Admins. These groups do not have “Apply Group Policy” by default, so the GPO will not apply to members of these groups and will apply only to the user “MSFT Ajay”.

What will happen if there are Group Policy Objects (GPOs) in an Active Directory domain that are using security filtering as discussed in the example scenario above?

Symptoms when you have security filtering Group Policy Objects (GPOs) like the above example and you install the security update MS16-072:

  • Printers or mapped drives assigned through Group Policy Preferences disappear.
  • Shortcuts to applications on users’ desktops are missing
  • Security filtering group policy does not process anymore
  • You may see the following change in gpresult: Filtering: Not Applied (Unknown Reason)
What is the Resolution?

Simply adding the “Authenticated Users” group with the “Read” permissions on the Group Policy Objects (GPOs) should be sufficient. Domain Computers are part of the “Authenticated Users” group. “Authenticated Users” have these permissions on any new Group Policy Objects (GPOs) by default. Again, the guidance is to add just “Read” permissions and not “Apply Group Policy” for “Authenticated Users”
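
If you want to script that instead of using the GPMC delegation tab, here is a sketch with the GroupPolicy module (the GPO name “User-Policy” is just the example used in this post):

# Grant Authenticated Users Read (not Apply) on a single GPO
Set-GPPermission -Name 'User-Policy' -TargetName 'Authenticated Users' -TargetType Group -PermissionLevel GpoRead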

What if adding Authenticated Users with Read permissions is not an option?

If adding “Authenticated Users” with just “Read” permissions is not an option in your environment, then you will need to add the “Domain Computers” group with “Read” Permissions. If you want to limit it beyond the Domain Computers group: Administrators can also create a new domain group and add the computer accounts to the group so you can limit the “Read Access” on a Group Policy Object (GPO). However, computers will not pick up membership of the new group until a reboot. Also keep in mind that with this security update installed, this additional step is only required if the default “Authenticated Users” Group has been removed from the policy where user settings are applied.

Example Screenshots:

Now in the above scenario, after you install the security update, the user group policy is retrieved using the computer’s security context. Because a domain-joined computer is part of the “Domain Computers” security group by default, the client computer will be able to retrieve the user policies that need to be applied to the user, and they will be processed successfully.

How to identify GPOs with issues:

In case you have already installed the security update and need to identify Group Policy Objects (GPOs) that are affected, the easy way is to run gpupdate /force on a Windows client computer, then run gpresult /h new-report.html, open new-report.html, and review it for errors such as “Reason Denied: Inaccessible, Empty or Disabled”.
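
In other words, from a command prompt on the client:

gpupdate /force
gpresult /h new-report.html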

What if there are a lot of GPOs?

A sample script is available to check if your Group Policy Objects (GPOs) have permissions missing:

MS16-072 – Known Issue – Use PowerShell to Check GPOs:

https://blogs.technet.microsoft.com/poshchap/2016/06/16/ms16-072-known-issue-use-powershell-to-check-gpos/
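
If you only want a rough first pass before running the linked script, something along these lines lists GPOs where neither “Authenticated Users” nor “Domain Computers” has at least Read (a sketch, not the script from the link above):

Import-Module GroupPolicy
Get-GPO -All | ForEach-Object {
    $perms = Get-GPPermission -Guid $_.Id -All
    $ok = $perms | Where-Object {
        $_.Trustee.Name -in 'Authenticated Users','Domain Computers' -and
        $_.Permission -in 'GpoRead','GpoApply'
    }
    if (-not $ok) { $_.DisplayName }   # GPOs that likely need attention
}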

Below are some Frequently asked Questions we have seen:

Frequently Asked Questions (FAQs):

Q1) Do I need to install the fix on only client OS? OR do I also need to install it on the Server OS?

A1) It is recommended you patch Windows and Windows Server computers which are running Windows Vista, Windows Server 2008 and newer Operating Systems (OS), regardless of SKU or role, in your entire domain environment. These updates only change behavior from a client (as in “client-server distributed system architecture”) standpoint, but all computers in a domain are “clients” to SYSVOL and Group Policy; even the Domain Controllers (DCs) themselves

Q2) Do I need to enable any registry settings to enable the security update?

A2) No, this security update will be enabled when you install the MS16-072 security update; however, you need to check the permissions on your Group Policy Objects (GPOs) as explained above

Q3) What will change in regard to how group policy processing works after the security update is installed?

A3) To retrieve user policy, the connection to the Windows domain controller (DC) prior to the installation of MS16-072 is made under the user’s security context. With this security update installed, Windows Group Policy clients will instead use the local system’s security context, thereby forcing Kerberos authentication

Q4) We already have the security update MS15-011 & MS15-014 installed which hardens the UNC paths for SYSVOL & NETLOGON & have the following registry keys being pushed using group policy:

  • RequirePrivacy=1
  • RequireMutualAuthentication=1
  • RequireIntegrity=1

Should the UNC Hardening security update with the above registry settings not take care of this vulnerability when processing group policy from the SYSVOL?

A4) No. UNC Hardening alone will not protect against this vulnerability. In order to protect against this vulnerability, one of the following scenarios must apply: UNC Hardened access is enabled for SYSVOL/NETLOGON as suggested, and the client computer is configured to require Kerberos FAST Armoring

– OR –

UNC Hardened Access is enabled for SYSVOL/NETLOGON, and this particular security update (MS16-072 \ KB3163622) is installed

Q5) If we have security filtering on Computer objects, what change may be needed after we install the security update?

A5) Nothing will change in regard to how Computer Group Policy retrieval and processing works

Q6) We are using security filtering for user objects and after installing the update, group policy processing is not working anymore

A6) As noted above, the security update changes the way user group policy settings are retrieved. The reason for group policy processing failing after the update is installed is because you may have removed the default “Authenticated Users” group from the Group Policy Object (GPO). The computer account will now need “read” permissions on the Group Policy Object (GPO). You can add “Domain Computers” group with “Read” permissions on the Group Policy Object (GPO) to be able to retrieve the list of GPOs to download for the user

Example Screenshot as below:

Q7) Will installing this security update impact cross forest user group policy processing?

A7) No, this security update will not impact cross forest user group policy processing. When a user from one forest logs onto a computer in another forest and the group policy setting “Allow Cross-Forest User Policy and Roaming User Profiles” is enabled, the user group policy during the cross forest logon will be retrieved using the user’s security context.

Q8) Is there a need to specifically add “Domain Computers” to make user group policy processing work or adding “Authenticated Users” with just read permissions should suffice?

A8) There is no need to specifically add “Domain Computers”; adding “Authenticated Users” with Read permissions should suffice. If you already have “Authenticated Users” added with at least Read permissions on a GPO, there is no further action required. “Domain Computers” are by default part of the “Authenticated Users” group, so user group policy processing will continue to work. You only need to add “Domain Computers” to the GPO with Read permissions if you do not want to give “Authenticated Users” Read access

Thanks,

Ajay Sarkaria

Supportability Program Manager – Windows

Access-Based Enumeration (ABE) Concepts (part 1 of 2)


Hello everyone, Hubert from the German Networking Team here.  Today I want to revisit a topic that I wrote about in 2009: Access-Based Enumeration (ABE)

This is the first part of a 2-part Series. This first part will explain some conceptual things around ABE.  The second part will focus on diagnostic and troubleshooting of ABE related problems.  The second post is here.

Access-Based Enumeration has existed since Windows Server 2003 SP1 and has not changed in any significant way since my blog post in 2009. However, what has significantly changed is its popularity.

With its integration into V2 (2008 Mode) DFS Namespaces and the increasing demand for data privacy, it became a tool of choice for many architects. However, the same strict limitations and performance impact it had in Windows Server 2003 still apply today. With this post, I hope to shed some more light here as these limitations and the performance impact are either unknown or often ignored. Read on to gain a little insight and background on ABE so that you:

  1. Understand its capabilities and limitations
  2. Gain the background knowledge needed for my next post on how to troubleshoot ABE

Two things to keep in mind:

  • ABE is not a security feature (it’s more of a convenience feature)
  • There is no guarantee that ABE will perform well under all circumstances. If performance issues come up in your deployment, disabling ABE is a valid solution.

So without any further ado let’s jump right in:

What is ABE and what can I do with it?

From the TechNet topic:

“Access-based enumeration displays only the files and folders that a user has permissions to access. If a user does not have Read (or equivalent) permissions for a folder, Windows hides the folder from the user’s view. This feature is active only when viewing files and folders in a shared folder; it is not active when viewing files and folders in the local file system.”

Note that ABE has to check the user’s permissions at the time of enumeration and filter out files and folders they don’t have Read permissions to. Also note that this filtering only applies if the user is attempting to access the share via SMB versus simply browsing the same folder structure in the local file system.

For example, let’s assume you have an ABE enabled file server share with 500 files and folders, but a certain user only has read permissions to 5 of those folders. The user is only able to view 5 folders when accessing the share over the network. If the user logs on to this server and browses the local file system, they will see all of the files and folders.

In addition to file server shares, ABE can also be used to filter the links in DFS Namespaces.

With V2 Namespaces DFSN got the capability to store permissions for each DFSN link, and apply those permissions to the local file system of each DFSN Server.

Those NTFS permissions are then used by ABE to filter directory enumerations against the DFSN root share thus removing DFSN links from the results sent to the client.

Therefore, ABE can be used to either hide sensitive information in the link/folder names, or to increase usability by hiding hundreds of links/folders the user does not have access to.
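
For reference, ABE is toggled per share or per namespace root. On current Windows Server versions the PowerShell equivalents look roughly like this (the share name and namespace path are placeholders):

# SMB share (Windows Server 2012 or later)
Set-SmbShare -Name 'Data' -FolderEnumerationMode AccessBased

# DFS Namespace root (2008-mode namespace, DFSN module)
Set-DfsnRoot -Path '\\contoso.com\Public' -EnableAccessBasedEnumeration $true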

How does it work?

The filtering happens on the file server at the time of the request.

Any Object (File / Folder / Shortcut / Reparse Point / etc.) where the user has less than generic read permissions is omitted in the response by the server.

Generic Read means:

  • List Folder / Read Data
  • Read Attributes
  • Read Extended Attributes
  • Read Permissions

If you take any of these permissions away, ABE will hide the object.

So you could create a scenario (i.e. remove the Read Permissions permission) where the object is hidden from the user, but they could still open/read the file or folder if they know its name.

That brings us to the next important conceptual point we need to understand:

ABE does not do access control.

It only filters the response to a Directory Enumeration. The access control is still done through NTFS.

Aside from that ABE only works when the access happens through the Server Service (aka the Fileserver). Any access locally to the file system is not affected by ABE. Restated:

“Access-based enumeration does not prevent users from obtaining a referral to a folder target if they already know the DFS path of the folder with targets. Permissions set using Windows Explorer or the Icacls command on namespace roots or folders without targets control whether users can access the DFS folder or namespace root. However, they do not prevent users from directly accessing a folder with targets. Only the share permissions or the NTFS file system permissions of the shared folder itself can prevent users from accessing folder targets.” Recall what I said earlier, “ABE is not a security feature”. TechNet

ABE does not do any caching.

Every request causes a filter calculation. There is no cache. ABE will repeat the exact same work for identical directory enumerations by the same user.

ABE cannot predict the permissions or the result.

It has to do the calculations for each object in every level of your folder hierarchy every time it is accessed.

If you use inheritance on the folder structure, a user will have the same permissions and thus the same filter result from ABE through the entire folder structure. Still, ABE has to calculate this result, consuming CPU cycles in the process.

If you enable ABE on such a folder structure you are just wasting CPU cycles without any gain.

With those basics out of the way, let’s dive into the mechanics behind the scenes:

How the filtering calculation works

  1. When a QUERY_DIRECTORY request (https://msdn.microsoft.com/en-us/library/cc246551.aspx) or its SMB1 equivalent arrives at the server, the server will get a list of objects within that directory from the filesystem.
  2. With ABE enabled, this list is not immediately sent out to the client, but instead passed over to the ABE for processing.
  3. ABE will iterate through EVERY object of this list and compare the permissions of the user with the object’s ACL.
  4. The objects where the user does not have generic read access are removed from the list.
  5. After ABE has completed its processing, the client receives the filtered list.
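
To make the model concrete, here is a minimal conceptual sketch in Python. It is purely illustrative: the Entry type and the rights strings are stand-ins for the server's real ACL evaluation against the caller's access token, not the actual implementation.

```python
from dataclasses import dataclass

# The four rights that together make up "generic read" (see the list above).
GENERIC_READ = frozenset({
    "List Folder / Read Data",
    "Read Attributes",
    "Read Extended Attributes",
    "Read Permissions",
})

@dataclass
class Entry:
    name: str
    rights: frozenset  # rights the requesting user effectively holds on this object

def abe_filter(listing: list[Entry]) -> list[Entry]:
    # Steps 2-4: iterate EVERY object and drop those where the user lacks
    # any part of generic read. Only then (step 5) is the list returned.
    return [e for e in listing if GENERIC_READ <= e.rights]

# Example: full generic read on "Public", but "Read Permissions" was removed
# on "HR" - so "HR" disappears from the enumeration result.
listing = [
    Entry("Public", GENERIC_READ),
    Entry("HR", GENERIC_READ - {"Read Permissions"}),
]
print([e.name for e in abe_filter(listing)])  # -> ['Public']
```

Note that the work scales linearly with the number of objects in the directory, which is exactly why directory size matters so much further down.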

This yields two effects:

  • This comparison is an active operation and thus consumes CPU cycles.
  • This comparison takes time, and this time is passed down to the user, as the results are only sent once the comparisons for the entire directory are complete.

This brings us directly to the core point of this Blog:

In order to successfully use ABE in your environment you have to manage both effects.

If you don't, ABE can cause a widespread outage of your file services.

The first effect can cause a complete saturation of your CPUs (all cores at 100%).

This not only increases the file server's response times to the point where the server stops accepting new connections, or clients drop their connections after waiting several minutes for a response; it can also prevent you from establishing a Remote Desktop connection to the server to make any changes (such as disabling ABE).

The second effect can increase the response times of your file server (even if it is otherwise idle) to a level that users will no longer accept.

The comparison for a single directory enumeration by a single user can keep one CPU in your server busy for quite some time, making it more likely that new incoming requests overlap with ABE calculations that are already running. This eventually results in a backlog, adding further to the delays experienced by your clients.

To illustrate this let’s roll some numbers:

A little disclaimer:

The following calculation reflects what I've seen; your results may differ, as there are many moving pieces in play here. In other words, your mileage may vary. That aside, the numbers seen here are not far-fetched; they stem from real production environments. Disk and CPU performance and other workloads factor into these numbers as well.

Thus the calculation and numbers are for illustration purposes only. Don't use them to calculate your server's performance capabilities.

Let's assume you have a DFS Namespace with 10,000 links that is hosted on DFS servers with four 3.5 GHz CPUs (also assuming RSS is configured correctly and all 4 CPUs are used by the File service: https://blogs.technet.microsoft.com/networking/2015/07/24/receive-side-scaling-for-the-file-servers/ ).

We usually expect single digit millisecond response times measured at the fileserver to achieve good performance (network latency obviously adds to the numbers seen on the client).

In our scenario above (10,000 links, ABE enabled, 3.5 GHz CPUs) it is not unusual for a single enumeration of the namespace to take 500 ms.

CPU cores and speed | DFS Namespace links | RSS configured per recommendations | ABE enabled? | Response time
4 @ 3.5 GHz         | 10,000              | Yes                                | No           | < 10 ms
4 @ 3.5 GHz         | 10,000              | Yes                                | Yes          | 300 – 500 ms

That means a single CPU can handle up to 2 directory enumerations per second. Multiplied across 4 CPUs, the server can handle 8 user requests per second. Any more than those 8 requests per second and we push the server into a backlog.
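
To make the arithmetic explicit, here is the same back-of-the-envelope calculation as a small Python snippet. The numbers are the illustrative figures from the table above, not a sizing formula.

```python
# Back-of-the-envelope throughput estimate using the illustrative
# numbers above - not a sizing formula for real servers.

def max_enumerations_per_second(latency_ms: float, cpu_count: int) -> float:
    per_core = 1000.0 / latency_ms   # enumerations one core can complete per second
    return per_core * cpu_count      # best case: perfect scaling across all cores

print(max_enumerations_per_second(500, 4))  # 500 ms per enumeration, 4 cores -> 8.0
# Requests arriving faster than this rate start to queue up (backlog).
```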

Backlog in this case means new requests are stuck in the Processor Queue behind other requests, therefore multiplying the wait time.

This can reach dimensions where the client (and the user) waits for minutes, until the client eventually decides to kill the TCP connection and, in the case of DFSN, fail over to another server.

Anyone remotely familiar with Fileserver Scalability probably instantly recognizes how bad and frightening those numbers are.  Please keep in mind, that not every request sent to the server is a QUERY_DIRECTORY request, and all other requests such as Write, Read, Open, Close etc. do not cause an ABE calculation (however they suffer from an ABE-induced lack of CPU resources in the same way).

Furthermore, the Windows File Service Client caches the directory enumeration results if SMB2 or SMB3 is used (https://technet.microsoft.com/en-us/library/ff686200(v=ws.10).aspx ).

There is no such Cache for SMB1. Thus SMB1 Clients will send more Directory Enumeration Requests than SMB2 or SMB3 Clients (particularly if you keep the F5 key pressed).

It should now be obvious that you should prefer SMB2/SMB3 over SMB1 and make sure the client caches stay enabled if you use ABE on your servers.
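
If you want to verify on a client that the SMB2/SMB3 directory cache has not been switched off, the linked article describes the DirectoryCacheLifetime value under the LanmanWorkstation parameters key. Here is a hedged sketch of such a check in Python (Windows only); the value is usually absent, which means the built-in default lifetime is in effect.

```python
# Check whether the SMB2/SMB3 client directory cache has been disabled
# via the registry. Sketch only - the value is normally not present,
# in which case the default lifetime described in the linked article applies.
import winreg

PATH = r"SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters"

try:
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, PATH) as key:
        lifetime, _ = winreg.QueryValueEx(key, "DirectoryCacheLifetime")
    if lifetime == 0:
        print("Directory cache is disabled (DirectoryCacheLifetime = 0)")
    else:
        print(f"DirectoryCacheLifetime is set to {lifetime} seconds")
except FileNotFoundError:
    print("DirectoryCacheLifetime not set - the default lifetime is in effect")
```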

As you might have realized by now, there is no easy or reliable way to predict the CPU demand of ABE. If you are designing a completely new environment, you usually cannot forecast the proportion of QUERY_DIRECTORY requests relative to other requests, or how frequently they will occur.

Recommendations!

The most important recommendation I can give you is:

Do not enable ABE unless you really need to.

Let's take users' home shares as an example:

  • Usually no user browses manually through this structure; instead, each user gets a mapped drive pointing to their own folder, so the usability aspect does not apply.
  • Most users already know (or can look up in the address book) the names or aliases of their colleagues, so there is no sensitive information to hide.
  • For ease of management, most home shares live in one big namespace or server share, which makes them a poor fit for ABE.
  • In many cases the user has Full Control (or at least write permissions) inside their own home share.

So why waste CPU cycles filtering requests inside someone's home share?

Considering all those points, I would be intrigued to hear a compelling argument for enabling ABE on user home shares or roaming profile shares. Please sound off in the comments.

If you have a data structure where you really do need to enable ABE, your file service design needs to meet these four requirements:

You need Scalability.

You need the ability to increase the number of CPUs doing the ABE calculations in order to react to increasing numbers (directory sizes, number of clients, usage frequency) and thus performance demand.

The easiest way to achieve this is to do ABE Filtering exclusively in DFS Domain Namespaces and not on the Fileservers.

That way you can easily add more CPUs simply by adding further namespace servers in the sites where they are required.

Also keep in mind that you should have some redundancy, and that a remaining server might not be able to take the full additional load of a failed server on top of its own.

You need small chunks.

The number of objects that ABE needs to check for each calculation is the single most important factor for the performance requirement.

Instead of having a single big 10,000-link namespace (the same applies to directories on file servers), build ten smaller 1,000-link namespaces and combine them into a DFS cascade.

That way, ABE only needs to filter 1,000 objects per request.

Just re-do the example calculation above with 250ms, 100ms, 50ms or even less.

You will notice that you are suddenly able to reach very decent numbers in terms of requests per second.
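
For example, with the same illustrative assumptions as above: at 100 ms per enumeration a single core completes roughly 10 enumerations per second, or about 40 per second across 4 cores, five times the throughput of the 500 ms case; at 50 ms you are already at roughly 80 per second.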

The other nice side effect is that you will perform fewer calculations overall, as a user usually follows only one branch of the directory tree and thus does not trigger ABE calculations for the other branches.

You need Separation of Workloads.

Having your SQL Server run on the same machine as your ABE server can starve both workloads of performance.

Having ABE run on your domain controller exposes the domain controller role to the risk of being starved of CPU cycles and thus no longer servicing domain logons.

You need to test and monitor your performance.

In many cases you are deploying a new file service concept into an existing environment.

Thus you can get some numbers regarding QUERY_DIRECTORY requests from the existing DFS and file servers.

Build up your Namespace / Shares as you envisioned and use the File Server Capacity Tool (https://msdn.microsoft.com/en-us/library/windows/hardware/dn567658(v=vs.85).aspx ) to simulate the expected load against it.

Monitor the SMB service response times, the processor utilization and queue length, and the responsiveness on the client while browsing through the structures.
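
To get a feel for the client-side experience, you can also time directory enumerations directly from a client. The following is a rough, illustrative sketch (the UNC path is hypothetical; adjust it to one of your shares), and keep in mind that the SMB2/SMB3 client cache will satisfy some of the repeated requests locally.

```python
# Rough client-side probe: time repeated directory enumerations against a
# share and print simple latency statistics. Illustrative sketch only.
import os
import statistics
import time

SHARE = r"\\fileserver01\projects"  # hypothetical share path - adjust to your environment
SAMPLES = 50

timings_ms = []
for _ in range(SAMPLES):
    start = time.perf_counter()
    entries = os.listdir(SHARE)     # triggers a directory enumeration (subject to client caching)
    timings_ms.append((time.perf_counter() - start) * 1000)
    time.sleep(1)                   # spread the samples out a little

print(f"{len(entries)} entries per listing over {SAMPLES} samples")
print(f"median: {statistics.median(timings_ms):.1f} ms, max: {max(timings_ms):.1f} ms")
```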

This should give you an idea of how many servers you will need, and whether a slimmer design of the data structures is required.

Keep monitoring those values through the lifecycle of your file server deployment in order to scale up in time.

Any deployment of new software, clients or the normal increase in data structure size could throw off your initial calculations and test results.

In my opinion, this point should be spelled out very clearly in any design documentation.

This concludes the first part of this Blog Series.

I hope you found it worthwhile and now have an understanding of how to successfully design a file service with ABE.

Now to round off your knowledge, or if you need to troubleshoot a Performance Issue on an ABE-enabled Server, I strongly encourage you to read the second part of this Blog Series. This post will be updated as soon as it’s live.

With best regards,

Hubert
