Replication De-Obfuscation

From SWRC Wiki
Revision as of 12:14, 19 May 2024 by Plasma (talk | contribs)

Replication De-Obfuscation Introduction

Ask any halfway decent UnrealScript programmer what the most confusing aspect of Unreal coding is, and he'll be sure to tell you replication (or perhaps PlayerMovement or pawn/bot AI :). I've been there myself, and I still have trouble with it sometimes, but I think I know enough to write some documentation. Lots of trial and error got me far, along with various insights from people who had the source code to the engine, and I've tried to piece it all together in this document. I have, however, recently gotten a lot more information about exactly how replication works, which has helped me immensely in understanding it. Since I can't post that information directly because of an NDA, I'll try my best to explain it to you through this networking documentation. Hopefully I'll be able to expand upon what was described in Epic's documents at http://unreal.epicgames.com/Network.htm and http://unreal.epicgames.com/UTMods.html. Please read them first, as they will help a lot with background and context. In writing this document, I'm going to assume you've programmed in UnrealScript before, you have at the very least considered the problems of replication, you are reasonably intelligent, you know how to read, and you think my jokes are funny.

First, Unreal Tournament uses a client-server architecture. The client is basically a dumb terminal that sends data to the server about where the player moved, whether they're jumping, whether they're shooting, etc. It displays to the user what is given to it by the server. The client's information is necessarily out of date with what exists on the server, because of lag, ping time, and other factors. So in order to present a believable world, the client has logic to simulate what it thinks the server will be sending it. It knows the laws of physics, and that if something is travelling in a straight line, then it will continue to travel with that same velocity. Likewise, if it's falling through the air, it's likely to continue to be accelerated under the force of gravity. The dumb terminal predicts how the game state will progress, and so can provide the user with a more believable game. On the other end of the connection is the server, which holds the authoritative state of what's going on in the game. If the server thinks you are dead, then you are dead. Like your teachers, it's never wrong, no matter how much evidence you've seen to the contrary (like how you *thought* you dodged that rocket). The client gets information sent to it by the server (like where other people are, what things have been shot, etc.), and sends other information back to the server (like where the client wants to move, shoot, etc.). In the client-server architecture, the client only communicates with the server, and the server communicates with all of the clients (people connected to the server).
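To make the prediction idea concrete, here is a minimal Python sketch of how a client might extrapolate an actor between server updates. It is not engine code: the function name, the tuple layout, and the gravity constant are all assumptions made up for the illustration.

```python
# Hypothetical sketch of client-side prediction; not Unreal's actual code.
GRAVITY_Z = -950.0  # assumed acceleration in units/s^2, picked for the example


def predict(location, velocity, dt, falling=False):
    """Extrapolate (location, velocity) dt seconds past the last server update."""
    x, y, z = location
    vx, vy, vz = velocity
    x += vx * dt
    y += vy * dt
    if falling:
        # Constant acceleration: gravity bends the vertical motion.
        z += vz * dt + 0.5 * GRAVITY_Z * dt * dt
        vz += GRAVITY_Z * dt
    else:
        # Straight-line motion: the velocity is assumed to stay the same.
        z += vz * dt
    return (x, y, z), (vx, vy, vz)


# A rocket flying straight, half a second after the last update.
rocket_loc, rocket_vel = predict((0.0, 0.0, 100.0), (900.0, 0.0, 0.0), 0.5)

# A player who just walked off a ledge, half a second later.
faller_loc, faller_vel = predict((0.0, 0.0, 100.0), (0.0, 0.0, 0.0), 0.5, falling=True)
```

When the next server update arrives, the client would replace its guess with the authoritative values, since the server always wins.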

Server as Authority

With the recent brouhaha regarding the ZeroPing mod, which aims to give the client more control, there have been two sides to the question of whether the server as authority is the best model.

A good description of the history of the networking architectures in games can be found at the top of the tech page's networking document. Go read the "History" section now, and then come back. The current model is that of a client-server architecture, where the client has prediction to minimize the effects of lag. In current versions of UT, the client tells the server all about its input: where the user is looking, in which direction they are moving, and whether they are firing primary or secondary fire. The server then uses that data to calculate information about the player, and to tell other clients about that player. For the most part, that is all there is to it. ZeroPing pioneered the idea of giving the client more power. It allowed the client to decide whether it hit the other player. In theory, this sounds great. The user gets to decide what's happening, and if they hit the player, then they hit the player. Unfortunately, there are problems with this idea.

In a large-scale game like UT or Q3A, many people try to cheat. The people that make the game have to prevent that from happening. With these games, the client does so little logic that there is little room to cheat. The client sends its data to the server, and the trusted server decides whether the client hit the player or not. Cheats are extremely limited in what they can do, as they can only modify the direction the player is going, and whether to fire or not. It's not something that is really worthwhile. However, when you have something like ZeroPing, you are letting the client decide whether or not you hit the player. A cheat can pretend that it hit every player, and can wreak havoc in such a game, killing everyone the moment they enter view, and sometimes even getting repeated kills ten times in a row on a single player as they respawn and then are 'hit' again. Mods like ZeroPing are mods, and so do not have the same public appeal that a full game does. As such, mods are less likely to be hacked for cheats. But that does not make them immune. If game makers were to use the approach that ZeroPing uses, there would be a flood of cheats within weeks of the game's release. You can understand the resistance that the game companies have to such an approach. It is not because of 'not-invented-here' syndrome, as many have suggested, but rather because it really is much more open to cheating.

As a side note, I guess the ZeroPing author does realize this. He has hidden his code to make it harder to cheat with. He took the security through obfuscation approach. This will work for a while, until code decompilers become available, or the code gets out. While I am not working on any of these myself, I do know of some UnrealScript code decompilers in the works. So this approach really will not work in the long run.

Replication Overview

Replication deals with how information is sent from the server to the client and vice versa. Because of bandwidth limitations for modems, not everything can be sent, and because of the unpredictability of the Internet, not everything can be guaranteed to arrive on time. Only a subset of all the information is sent to the client. Replication deals with that subset that travels from the server to the client, and with how the client sends its relevant information back to the server.

While the definition of replication may sound simple, there are a lot of underlying assumptions and quirks that you must be aware of, which UT uses in the interest of optimum bandwidth and speed. It's an unmapped jungle out there, and many an explorer has been lost to the dangerous native inhabitants. This guide you are reading is your map and survival guide.

First, on every tick the Unreal server does something like the following:

Run all events (like tick, timer, hitwall, etc)
Go through each client connected to the server
Get the list of relevant actors for this client.
For each actor, replicate variables that need to be replicated.
Replicate any function calls that should be replicated to that particular client, if they were called during the previous server-side events run above. 
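The server steps above can be sketched as a small Python model. This is only an illustration of the control flow under stated assumptions: the function, its parameters, and the message tuples are hypothetical, and the events phase (the first step) is assumed to have already run and filled in the changed variables and queued calls.

```python
# Hypothetical model of one server tick; names are illustrative, not engine API.
def server_tick(actors, clients, is_relevant, dirty_vars, pending_calls):
    """Return {client: [messages]} describing what this tick would send."""
    outgoing = {}
    for client in clients:
        messages = []
        for actor in actors:
            if not is_relevant(actor, client):
                continue  # non-relevant actors send nothing at all
            for name, value in dirty_vars.get(actor, {}).items():
                messages.append(("var", actor, name, value))
        # Function calls queued for this client during the events phase.
        for call in pending_calls.get(client, []):
            messages.append(("call",) + call)
        outgoing[client] = messages
    return outgoing


out = server_tick(
    actors=["Rocket", "PathNode"],
    clients=["ClientA"],
    is_relevant=lambda actor, client: actor != "PathNode",
    dirty_vars={"Rocket": {"Location": (1, 2, 3)},
                "PathNode": {"Location": (0, 0, 0)}},
    pending_calls={"ClientA": [("Rocket", "Explode")]},
)
```

In this toy run the PathNode's changed variable never reaches ClientA, because the actor is not relevant to that client.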

The client does the following logic:

Receive data from the server, and update its local copies of the actors
Run any replicated calls that were called on the server
Run any simulated events (like tick, timer, hitwall, etc), and do the local playerpawn handling
Find out what replicated functions were called on the client and need to be replicated to the server, and send them

If that didn't make any sense, don't worry about it. The two lists just outline the main parts of replication, and they will all be explained later.

If you did understand that, there's still plenty to learn. UT has a lot of logic going on under the hood in regards to replication, most of which is not immediately apparent. That is the purpose of this document: to explain these mysteries to you, the reader, so that your networking development will go that much more smoothly. Again, all this will be explained in due time, assuming you read the whole document. :)

Relevant Actors

First, we need to define what a relevant actor is. A client, as the dumb terminal, needs to know about everything that it will need to display on screen. It needs to know about the other players and weapons in the level, and whether any projectiles, the flag, health packs, etc. are visible. All of these actors constitute the relevant set of actors. How is the relevant set of actors determined? If an actor is not relevant, no amount of replication statements or voodoo magic will cause its information to be sent to the client. It must be relevant to that particular client in order for data to be sent to that client. For Unreal / Unreal Tournament, the method of determining relevant actors is as follows (ripped from the tech page's networking document):

The following logic is performed on each actor, and from it you should be able to tell whether an actor is relevant. The actor must pass two discrete sets of steps in order to be considered relevant. Once it passes the first test, it must then pass the second. The rules higher in the list have precedence over rules that follow them:

If it is bNetTemporary, then it does not pass the first stage.
If it is not bStatic and not bNoDelete, then it passes the first stage.
If it is a LevelInfo, it passes the first stage.
If it is a temporary network actor (as defined by bNetTemporary), and it's been sent to this client already, then it does not pass the first stage. These types of actors will be described in more detail later.
If its Role is ROLE_None, then it does not pass the first stage.
If it is bAlwaysRelevant, and the time since the last relevancy is less than 1 / NetUpdateFrequency (described farther below) seconds, then it does not pass the first stage. This basically means an actor will only update its variables to the client NetUpdateFrequency times per second. (NetUpdateFrequency values: Actor: 100, ZoneInfo: 4, GameReplicationInfo: 4, PlayerReplicationInfo: 2, Inventory: 8.)
If it is bNetOptional, and it has a non-zero LifeSpan (meaning it does not exist forever), and it is not in the first 150 ms of its lifetime, then it does not pass the first stage. Any attempt at replicating a bNetOptional actor after the first 150 ms (because it was pushed off the replication list due to bandwidth constraints for that client, etc.) will fail.
If the time since the last relevancy is greater than 1 / NetUpdateFrequency seconds, it passes the first stage. (This is a more finely tuned optimization of NetUpdateFrequency, tuned to the type of actor. It will only perform the calculations needed if it has passed through the weeding techniques above.)
It does not pass the first stage. 
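The NetUpdateFrequency throttle in the list above boils down to a simple interval check. The following Python sketch shows only that one piece of the first stage, using the frequency values quoted in the list; the function names are hypothetical and the rest of the rules are deliberately left out.

```python
# NetUpdateFrequency values as quoted in the list above.
NET_UPDATE_FREQUENCY = {
    "Actor": 100.0,
    "ZoneInfo": 4.0,
    "GameReplicationInfo": 4.0,
    "PlayerReplicationInfo": 2.0,
    "Inventory": 8.0,
}


def update_interval(class_name):
    """Minimum number of seconds between updates for this kind of actor."""
    return 1.0 / NET_UPDATE_FREQUENCY[class_name]


def due_for_update(class_name, seconds_since_last):
    """True once more than 1 / NetUpdateFrequency seconds have gone by."""
    return seconds_since_last > update_interval(class_name)
```

For example, `update_interval("PlayerReplicationInfo")` is half a second, so scoreboard data would be refreshed at most twice per second, while an Inventory item checked 0.1 seconds after its last update would have to wait a little longer.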

If it passed the first stage, it's not through and clear to be relevant yet. It has another stage to go through. If it is deemed relevant during the following steps, then it is relevant to this client.

If it is bAlwaysRelevant, then it is relevant.
If it is owned by the client or owned by the actor we are viewing through (as returned by PlayerCalcView), then it is relevant.
If it is the client or the actor we are viewing through (as returned by PlayerCalcView), then it is relevant.
If its Instigator is the client, then it is relevant.
If it has an ambient sound and is within 'audible distance', then it is relevant.
If it is a weapon that is owned by a visible Pawn owner, then it is relevant.
If it is bHidden and not bBlockPlayers and its AmbientSound is None, then the actor is not relevant.
If it is not relevant (per the above tests), but any of the above conditions made it relevant within the last RelevantTimeout seconds (five seconds by default, as defined in UnrealTournament.ini under [IpDrv.TcpNetDriver]), then it is relevant.
It is not relevant. 
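Because the rules in this second list are ordered and the first match wins, they read naturally as a chain of early returns. Here is a Python sketch under that reading. The `ActorView` record and its field names are hypothetical stand-ins for facts the engine would work out itself (audibility, whether the owning Pawn is visible, and so on), and actors are identified by plain strings purely for the example.

```python
from dataclasses import dataclass
from typing import Optional

RELEVANT_TIMEOUT = 5.0  # seconds; the default quoted in the text


@dataclass
class ActorView:
    """Hypothetical per-client snapshot of the facts the rules look at."""
    name: str = ""
    owner: Optional[str] = None
    instigator: Optional[str] = None
    always_relevant: bool = False
    has_ambient_sound: bool = False
    within_audible_distance: bool = False
    weapon_of_visible_pawn: bool = False
    hidden: bool = False
    blocks_players: bool = False
    seconds_since_relevant: Optional[float] = None  # None = never relevant


def second_stage(actor, client, view_target):
    """Walk the rules top to bottom; the first rule that matches decides."""
    viewers = {v for v in (client, view_target) if v is not None}
    if actor.always_relevant:
        return True
    if actor.owner in viewers or actor.name in viewers:
        return True
    if actor.instigator == client:
        return True
    if actor.has_ambient_sound and actor.within_audible_distance:
        return True
    if actor.weapon_of_visible_pawn:
        return True
    if actor.hidden and not actor.blocks_players and not actor.has_ambient_sound:
        return False
    if (actor.seconds_since_relevant is not None
            and actor.seconds_since_relevant <= RELEVANT_TIMEOUT):
        return True
    return False
```

With this model, a hidden pathnode comes out as not relevant, while a hidden actor flagged always_relevant (a PlayerReplicationInfo, say) is caught by the very first rule and stays relevant.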

Those rules, while seeming complicated, are exactly how UT determines what is relevant to each client. Many times such checks are not related to what you are doing, but they are always good to know. Thinking about it, it makes sense. If something is not visible, then the client probably does not need to know about it. If it's a player on the other side of CTF-LavaGiant, what use does this client have for it? Or if it's an invisible pathnode used for Bot AI, the client also has no use for it. Sending data and variables for such actors would be wasteful on a tiny little 28.8 kbps modem.

Working this discussion into a guideline for your own code, you should set your actors' bAlwaysRelevant to true if they are data storage units that the client needs to know about, like PlayerReplicationInfo or GameReplicationInfo. Things like PlayerReplicationInfo, which store information about the player, are always relevant. This allows a client to view the Scoreboard and see the information (name, score, ping, packetloss, etc.) for everyone in the level, regardless of whether that player actor is relevant to the client. PlayerReplicationInfos allow precious information that's small, bandwidth-wise, to be relevant and replicated to each client, while information like the location, velocity, weapon, animations, etc. of the player is not replicated when it's not necessary (because it's in the Pawn/PlayerPawn, which is not always relevant). Hopefully I've not lost you here. Knowing where to put your variables, and how to make them relevant to the client, is an important thing to know and remember about Unreal replication. If you need some important game information to be seen by all clients, your best bet is to subclass ReplicationInfo and use that for your data variables. Information that is only needed when it's seen is best put in regular actors, which will go in and out of relevancy as time passes.

In a nutshell, if the client needs to know about it, it will. All inventory is relevant to the client, as are all nearby players. All projectiles and weapon effects that are visible are relevant. Anything that affects the player's world on the client will be replicated. This includes things that block the player's movement, or actors that play ambient music. There is an overhead to spawning a new actor in netplay, and so UT does not immediately remove an actor from the relevancy list when it goes out of view. It gives it a few seconds before it is really considered not relevant. Oh, and UT takes into account the fact that a player may be viewing through another actor's eyes, as in the case of the GuidedWarhead, or the teammate views when you hit the number keys on the right side of your keyboard.

NetPriorities

While that's the "simple" description, figuring out why things happen can get a bit more complicated. Unreal Tournament gives different priorities to different types of actors. NetPriority, which should typically be between 1 and 3, determines how to prioritize a given actor's bandwidth "allocation" in the stream being sent to each client. Actors are first sorted by a calculation based upon their priority (adjusted to take into account how closely the client is looking at them, etc.). Actors marked bNetOptional are sent to the bottom of the list. So actors with a higher priority are sent first in the bandwidth stream, and when the connection becomes saturated, any remaining actors are starved of bandwidth until later. (UT remembers which actors have not been replicated, and so it can give bNetOptional actors 150 ms to get replicated before they are discarded. See the above relevancy list for more information.)

Here's a wide and almost complete sampling of UT's NetPriorities:

Bots:            3.0
PlayerPawns:     3.0
GuidedRedeemer:  3.0
Movers:          2.7
Projectiles:     2.5
Carcasses:       2.5
Other Pawns:     2.0 (Spectators, Titans, Bunny Rabbits, etc)
Effects:         2.0 (sparks, blood, smoke)
Simpler effects: 1.4 (shell casings)
Inventory:       1.4 or 2.5
Everything else: 1.0 

This shows that the server considers a Player's (be it bot or playerpawn) information to be much more important than something like bullet casings from a gun. And it makes sense. A guided redeemer also has unpredictability to it, since it is controlled by a player, and thus needs a higher NetPriority. A projectile follows the simple Physics given to it, e.g. PHYS_Falling or PHYS_Projectile, and so it is quite predictable. Thus, its NetPriority is not as high as a player's. However, projectiles are still quite important to gameplay (doesn't it suck to get hit by that rocket you never saw? :), and they get a high NetPriority. It's the same with the carcasses, since their animations as they fly through the air or fall to the ground are quite important (not to gameplay, but to that touchy-feely good feeling when you kill someone). You should use these examples as a guide when creating your own actors that need a different priority.

One additional thing to note is that you can change the NetPriority on the fly. This is done with Inventory, for example. It exists as 1.4 for most of its life, but when it is thrown from a player's inventory, it now is a moving actor with more importance attached to it. It is then given a higher NetPriority. Once it hits the ground, it gets a NetPriority of 1.4 again, since it does not need to be updated as often when it is sitting stationary on the ground.

In general, all of the relevant actors for a particular client are put in an array each tick, and then sorted by their priorities. The server then starts replicating the actors, starting with the highest NetPrioritized actor and working its way down. UT then stops replicating actors when the bandwidth becomes saturated. That's why it's important to prioritize your actors correctly, so that when a lot of actors are on screen, or on the lowly 28.8K connections, UT will know how to perform triage, and who to starve of bandwidth first. If an actor is not replicated, the variables it was supposed to tell the client about are not forgotten. They will be put on the list during the next tick, to have another go at getting replicated. Note that doubling the size of the actor's NetPriority does not actually change anything in regards to giving it more bandwidth, since a NetPriority only has meaning in relation to other NetPriorities. Since 3.0 is the highest on the list, there is no real difference between giving your actor a NetPriority of 4.0 and 4000.0, since either way it will be placed at the top of the list.
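The triage described above can be modelled in a few lines of Python. This is a simplified sketch, not the engine's algorithm: it sorts on the raw NetPriority only (the real calculation also adjusts for where the client is looking, as mentioned earlier), and the per-actor sizes and the byte budget are invented numbers.

```python
def replicate_by_priority(actors, budget):
    """actors: list of (name, net_priority, size_in_bytes).

    Sends actors from highest priority down and stops at the first one that
    no longer fits. Returns (sent, starved); starved actors wait for a later tick.
    """
    ordered = sorted(actors, key=lambda a: a[1], reverse=True)
    sent, starved = [], []
    saturated = False
    for name, _priority, size in ordered:
        if not saturated and size <= budget:
            budget -= size
            sent.append(name)
        else:
            saturated = True  # bandwidth is used up for this tick
            starved.append(name)
    return sent, starved


actors = [("ShellCase", 1.4, 20), ("PlayerPawn", 3.0, 60), ("Rocket", 2.5, 30)]
sent, starved = replicate_by_priority(actors, budget=100)

# Only the ordering matters: 4.0 and 4000.0 both land at the top of the list.
with_4 = replicate_by_priority(actors + [("MyActor", 4.0, 10)], budget=100)
with_4000 = replicate_by_priority(actors + [("MyActor", 4000.0, 10)], budget=100)
```

Here the player and the rocket go out and the shell casing is starved, and the two MyActor runs produce identical results, which is the point about NetPriority only being meaningful relative to other NetPriorities.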

Roles and RemoteRoles Introduction

Now is as good a time as any to explain the purposes of the various roles in Unreal. Roles simply guide UT in determining what role this particular actor plays in the game of replication. For example, a player needs to be replicated quite differently from an ammo box sitting on the ground, or a projectile moving according to the rules of physics. On a server, the Role will be ROLE_Authority, which makes sense since that actor is currently being evaluated on the authoritative server. The RemoteRole of that actor will be set to any of a variety of roles, defined below. On the client, the Role and RemoteRole will be reversed: the RemoteRole will be ROLE_Authority, and the Role will be one of the other role types. Let's go through the various roles (which will be set in the RemoteRole default properties), in order of increasing 'control.'


ROLE_None

This role is probably the simplest. It simply tells the server not to make this actor relevant to the client. No information will be sent about it, and it will exist where it was created only. If it was created on both the client and the server (via a simulated function, described later), and given a Role of ROLE_None, then it would exist both on the server and on the client, and they would both operate independently of each other. 


ROLE_DumbProxy

This role is used for objects that don't move much, yet still need information about them updated to the client. The server does this by sending periodic updates about this actor's location and velocity to the client. If the actor has any sort of logic to it, with Physics, for example, it's much better to use the following role. 


ROLE_SimulatedProxy

This is used for anything that should be simulated over the network, via prediction. Take a rocket, for example. It always travels in a straight line, and is an ideal candidate for SimulatedProxy. By the same token, UT can predict the falling physics clientside, and so grenades and falling people are also good candidates for this role. Anything whose motion can be predicted clientside from its Physics should use this Role. All Bots and PlayerPawns use this role as well, since when a player is running, he's more than likely to continue to run. It's the best prediction method that can be used for unpredictable players. The only exception to this is the playerpawn you yourself are playing as.


ROLE_AutonomousProxy

This is used for you. When you play online, you yourself should be treated differently from all other playerpawns. You don't want your own self being predicted based upon the server's expectations. Rather, you want to control yourself and tell the server where you are moving. With this Role, that's easily possible. Many things that should only be done for your own playerpawn, like server-corrections of your location when you get lagged out, are only done with this role. 


ROLE_Authority

This role is different from the others in that it is always the Role on the server. On the server, all Roles will be ROLE_Authority, and the RemoteRole will be one of the above. On the client, the reverse will be true. This role doesn't have any real significance, beyond its use in replication statements, described below.
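The Role/RemoteRole swap is easy to get backwards, so here is a tiny Python model of it. The enum mirrors the five role names above in order of increasing control; the numeric values and the helper function are assumptions for this sketch, not something taken from the engine.

```python
from enum import IntEnum


class Role(IntEnum):
    # Ordered by increasing 'control', following the list above.
    ROLE_None = 0
    ROLE_DumbProxy = 1
    ROLE_SimulatedProxy = 2
    ROLE_AutonomousProxy = 3
    ROLE_Authority = 4


def roles_on_client(server_role, server_remote_role):
    """Return (Role, RemoteRole) as the client sees them: the pair is swapped."""
    return server_remote_role, server_role


# A rocket on the server: Role is Authority, RemoteRole is SimulatedProxy.
client_role, client_remote_role = roles_on_client(
    Role.ROLE_Authority, Role.ROLE_SimulatedProxy
)
```

On the client, the same rocket ends up with a Role of ROLE_SimulatedProxy and a RemoteRole of ROLE_Authority.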

General Replication

For each actor that is deemed to be relevant to a particular client, the server needs to determine what subset of its variables will be sent to the client. The most obvious rule is to only send variables if they change. Unfortunately, that rule alone is not enough to fit the data over a modem. Instead, the server has to further whittle down its list of variables, and it does this through replication statements. These replication statements are executed for each client, for each actor relevant to that client. Only when a subset of the variables of a subset of the actors is sent over the network can it fit over a regular modem. But how can a client operate without all of the information in the game? Once you stop and think about it, it's not that unthinkable. Do all of a Bot's AI variables need to be replicated to the client? Only the result of the AI needs to be sent, which basically consists of the same things as for any other Pawn in the game. Also, depending upon the RemoteRole, different things are replicated. Some roles cause the Location to be updated periodically as a poor man's simulation of the actor's motion, while another role causes the physics to be replicated and the location not to be, creating true client-side simulation of the actor's motion.
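The two filters described here, "covered by a replication statement that currently holds" and "changed since it was last sent", can be combined in a short Python sketch. Everything in it is hypothetical: the function, the dictionaries, and the idea of passing in already-evaluated conditions are simplifications for illustration only.

```python
def vars_to_send(current, last_sent, conditions):
    """Pick the variables to send to one client for one relevant actor.

    current    -- the actor's variables on the server
    last_sent  -- what this client was last told
    conditions -- variable name -> result of its replication condition for
                  this client; variables not listed are never replicated
    """
    out = {}
    for name, value in current.items():
        if not conditions.get(name, False):
            continue  # no replication statement covers it, or it came out false
        if name not in last_sent or last_sent[name] != value:
            out[name] = value  # only changed values use up bandwidth
    return out


bot_on_server = {"Location": (10, 0, 0), "Health": 100, "AIState": "Hunting"}
client_knows = {"Location": (0, 0, 0), "Health": 100}
update = vars_to_send(bot_on_server, client_knows,
                      conditions={"Location": True, "Health": True})
```

In this run only the new Location is sent: Health has not changed, and the bot's AI state is not covered by any condition, so the client never hears about it.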

No Recursive Relevancy

Please note one important point that can easily be overlooked. Relevancy is not recursive through replicated references to actors. Let's say that Actor A contains a reference to Actor B in Property C, and Actor A replicates Property C (currently set to Actor B). Let's also say that Actor A is relevant, and Actor B is not. (Don't worry, there are only three letters in this example.) In this example, Actor A is relevant, and so its Property C variable will be replicated to the client. Now what does the client know? The client knows about Actor B through Actor A's reference, but Actor B is not relevant. None of Actor B's variables are checked for replication. The client only knows about Actor B, which really is of no use whatsoever, since the variables are not replicated. I initially thought that replicated references to actors caused those actors to become relevant and be parsed themselves for replicated variables, but that view is false. An actor can only be relevant according to the rules stated above, which make no provision for a replicated reference to it from a relevant actor.

Replication Checks

When you think about the number of clients on a server, the number of relevant actors for each client, and the number of replication checks on variables for each actor, there's a large number of such checks. Special care should be taken when implementing replication statements. As each conditional is evaluated, you should be considerate of the server's limited processing ability. Don't call functions from within replication statements, as function calls are much too slow for replication statements. Along the same lines, you should try to keep your checks as simple as possible, only evaluating a condition if it's necessary. Finally, note that Epic had a problem with replication checks taking too long for some important actors like Actor or Inventory, both of which replicate many variables. To alleviate the situation on servers, Epic moved the checks to native code. That's why you will see the 'nativereplication' keyword on a few of the Engine classes. Any variables being evaluated in those classes will be covered by the nativereplication code, making any UnrealScript replication statements useless in that class (except as a form of documentation for UnrealScript programmers). Any variables that are not defined in a nativereplication class (that is, defined in a non-nativereplication subclass or superclass) will be replicated via the UnrealScript definitions. Despite the fact that the nativereplication checks are in C++, the UnrealScript checks that remain in those classes are still accurate, and a good guide to go by in determining problems with replication, and ways around them.
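To get a feel for how quickly the checks multiply, here is the arithmetic as a short Python snippet. All four numbers are made-up round figures chosen for illustration, not measurements from a real server.

```python
def checks_per_second(clients, relevant_actors, checks_per_actor, ticks_per_second):
    """Every client, every relevant actor and every check, on every tick."""
    return clients * relevant_actors * checks_per_actor * ticks_per_second


# Assumed figures: 16 clients, 50 relevant actors each,
# 20 replication conditionals per actor, 20 server ticks per second.
example = checks_per_second(16, 50, 20, 20)
```

With those assumed figures the server evaluates 320,000 conditionals every second, which is why a single slow check hurts so much.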

Special Replication Variables

When writing replication conditionals, there are a few things you should be aware of. During replication, a few helpful variables are set for your use in evaluating whether a variable should be replicated. These variables are defined in Actor, and are: