RunUO Community

PacketWrite prototype

So yeah, just thought I'd put this up here in case someone else comes across a similar problem.

Most of my PacketWriter class for Seasons deals with taking input, converting it to a series of bytes, writing it into a buffer, then flushing that buffer across the network.

Originally I was doing this with bit-shifting operations, which looks horrible and can be fairly cryptic, not to mention it dies when you shift to a different architecture.

Code:
        public void Write(int toWrite)
        {
            if ((this.m_Index + 4) > this.m_BufferLength)
            {
                this.Flush();
            }

            // The casts are required; assigning an int to a byte[] element won't compile otherwise.
            this.m_Buffer[this.m_Index++] = (byte)(toWrite >> 0x18);
            this.m_Buffer[this.m_Index++] = (byte)(toWrite >> 0x10);
            this.m_Buffer[this.m_Index++] = (byte)(toWrite >> 8);
            this.m_Buffer[this.m_Index++] = (byte)toWrite;
        }

Yeah, pretty fugly. After some MSDN'ing I came across a much cleaner way to do it. Behold, System.BitConverter.GetBytes() :D

Code:
        public void Write(int toWrite)
        {
            if ((this.m_Index + 4) > this.m_BufferLength)
            {
                this.Flush();
            }

            byte[] bytes = System.BitConverter.GetBytes(toWrite);

            this.m_Buffer[this.m_Index++] = bytes[0];
            this.m_Buffer[this.m_Index++] = bytes[1];
            this.m_Buffer[this.m_Index++] = bytes[2];
            this.m_Buffer[this.m_Index++] = bytes[3];
        }

Obviously for types with more or fewer bytes in them (a short is 2 bytes, for example) you just access fewer elements of the array. I could probably make this cleaner by copying the bytes array into the buffer array in place and then incrementing the index by the array's length, but I think anything that removes the >> and << operations is cleaner.
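
For what it's worth, here's a rough, untested sketch of that "copy in place" idea using Buffer.BlockCopy, reusing the same m_Buffer / m_Index / Flush members as above:

Code:
        public void Write(int toWrite)
        {
            byte[] bytes = System.BitConverter.GetBytes(toWrite);

            if ((this.m_Index + bytes.Length) > this.m_BufferLength)
            {
                this.Flush();
            }

            // Copy the whole temporary array in one call, then advance the index past it.
            Buffer.BlockCopy(bytes, 0, this.m_Buffer, this.m_Index, bytes.Length);
            this.m_Index += bytes.Length;
        }

Same behaviour as the four assignments, just without spelling each byte out.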
 

Jeff

Lord
So you traded in readability for more overhead? How .Net of you... If you don't like <<, >>, &, |, ^ (bitwise ops) you need to get out of programming in C# and move to VB. What you did just added time. You should put back the bit shift operations; it is faster, and that's what you want in networking code... speed. Just 'cause YOU cannot read it doesn't make it wrong or any less right.
 
I don't agree. I'd rather write readable code first, even if it is slightly slower; then, once the project's code base is stable and compiling/running without exceptions, I'd start gathering performance metrics and optimizing where required. Pre-optimizing code can lead to spending days on sections of code that don't get as much use as the programmer thinks.

Results with 100,000,000 iterations (in seconds):

BitShifted      GetBytes
 9.3672343     24.6662807
 9.9647801     25.6711347
 9.4059105     24.5241291
----------------------------------------
AVG: 9.5793083  24.95384817 (2.604x slower)

Obviously the right shifts are faster than the ".NET way": dividing the average result by the iteration count gives bit-shifting roughly 0.000000095793083s (~96ns) per conversion and GetBytes roughly 0.0000002495384817s (~250ns).

I would go back and replace it with a bit-shifting operation, if I found the networking code was causing enough of a slowdown in the project.

(Not performing thread necromancy, I've just been working a lot lately at work -.-)
 

arul

Sorceror
GodOfThePookies;830720 said:
I don't agree. I'd rather write readable code first, even if it is slightly slower; then, once the project's code base is stable and compiling/running without exceptions, I'd start gathering performance metrics and optimizing where required. [...]
There is nothing subtle about the bit-shift code, plus the code has been in production for several years now (with close to no changes made). If you take a peek at the internal implementation of the BitConverter class, you'll find out that it's using even dirtier tricks!

Code:
fixed (byte* numRef = buffer)
{
    *((int*) numRef) = value;
}
This is logically the fastest way possible ...

Is it more readable? I can only tell that the code is shorter, which suggests simpler code, even though the semantics behind it get quite complex. Can you work with pointers...?

Now, take a look at the code you've written, and at the thing you've noted about it:

After some MSDN'ing I came across a much cleaner way to do it.
Spot the problem here? You're contradicting yourself. The entire BitConverter class with its GetBytes() method acts like a black box here. Seeing a black box like this makes you think about its underlying implementation details, which you now need to look up by either reading the docs or peeking at the code in Reflector. And the "readability" goes down the tubes...
 

Jeff

Lord
Pointing out statistics, but not showing the code used to produce these statistics, is just dumb. Next time, show the code, or don't post such things... who knows, perhaps you didn't do something properly, thus producing false and/or invalid results.
 

Tartaros

Wanderer
GodOfThePookies;823708 said:
it dies when you shift to a different architecture.

There's no "different architecture" for C#; there's just the one defined in the .NET standard.


I'd have guessed the BitConverter way, which doesn't even do any bit-shifting, would be faster... Have you tried testing its speed using Buffer instead of the 4 assignments (which would be the correct way here)?
 
Jeff;834087 said:
Pointing out statistics, but not showing the code used to produce these statistics, is just dumb. Next time, show the code, or don't post such things... who knows, perhaps you didn't do something properly, thus producing false and/or invalid results.

I overlooked that, oops.
I've been running this in debug mode, for a 64-bit platform.

Test machine is an AMD X2 5700+ (2.7GHz) with 4GB DDR2 PC3200 running Windows 7 Ultimate 64-bit.
Code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Tester testing = new Tester();
            testing.RunTest();
        }
    }

    public class Tester
    {
        private int m_Index;
        private int m_BufferLength = 512;
        

        public void RunTest()
        {
            System.Console.WriteLine("Testing System.BitConverter.GetBytes() against Bit-shifting");
            System.Console.WriteLine("Test run of BitShifted:");

            // 100,000,000 runs should return an average enough result. (And show marked differences)


            Random RNG = new Random();

            Stopwatch watch = new Stopwatch(); watch.Start();

            // Little bit of overhead from the RNG, but I don't think it will affect the results too much.
            for (int i = 0; i < 100000000; i++)
            {
                Bitshifted(RNG.Next());
            }
            
            watch.Stop(); System.Console.WriteLine("Bitshifted finished in {0}", watch.Elapsed.ToString());

            System.Console.WriteLine("Test run of GetBytes:");

            watch.Reset(); // Clear the BitShifted time; Start() alone would resume and include it below.
            watch.Start();

            // Same as above, but at least its equal overhead on both methods.
            for (int i = 0; i < 100000000; i++)
            {
                Managed(RNG.Next());
            }

            watch.Stop(); System.Console.WriteLine("GetBytes finished in {0}", watch.Elapsed.ToString());

            System.Threading.Thread.Sleep(10000);
        }

        public void Flush()
        {
            // Stubbed.
            return;
        }

        /// <summary>
        /// This method uses bit-shifting to return a bytestream.
        /// </summary>
        /// <param name="toWrite"></param>
        /// <returns></returns>
        public byte[] Bitshifted(int toWrite)
        {
            m_Index = 0;
            byte[] TheBuffer = new byte[8];

            if ((this.m_Index + 4) > this.m_BufferLength)
            {
                this.Flush();
            }

            TheBuffer[this.m_Index++] = (byte)(toWrite >> 0x18 & 0xFF);
            TheBuffer[this.m_Index++] = (byte)(toWrite >> 0x10 & 0xFF);
            TheBuffer[this.m_Index++] = (byte)(toWrite >> 0x8 & 0xFF);
            TheBuffer[this.m_Index++] = (byte)(toWrite & 0xFF);

            return TheBuffer;
        }

        /// <summary>
        /// This method uses System.BitConverter.GetBytes() to return a bytestream.
        /// </summary>
        /// <param name="toWrite"></param>
        /// <returns></returns>
        public byte[] Managed(int toWrite)
        {
            m_Index = 0;

            byte[] TheBuffer = new byte[4];

            if ((this.m_Index + 4) > this.m_BufferLength)
            {
                this.Flush();
            }

			// Apply the returned data directly, rather than calling it four times.
            TheBuffer = System.BitConverter.GetBytes(toWrite);

            return TheBuffer;
        }
    }
}

There's no "different architecture" for C#; there's just the one defined in the .NET standard.

What I meant by "different architecture" was moving from a big-endian CPU to a little-endian machine, such as running Mono on a Sun SPARC (big-endian); Windows (x86/64) is little-endian.
Unless, unbeknownst to me, .NET shields you from this somehow?
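
(For what it's worth, the framework does expose the machine's byte order through BitConverter.IsLittleEndian, so you could branch on it. A quick untested sketch of a byte-order-safe variant:)

Code:
        // Hypothetical helper: always emit big-endian (network order) bytes,
        // whatever the underlying machine's endianness happens to be.
        public static byte[] GetBytesBigEndian(int value)
        {
            byte[] bytes = BitConverter.GetBytes(value); // native byte order
            if (BitConverter.IsLittleEndian)
                Array.Reverse(bytes);                    // flip to big-endian
            return bytes;
        }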

I admit that GetBytes is significantly slower than bit-shifting, which seems odd. At first, I figured it was down to the stack setup and call overhead from calling into the GetBytes function so frequently.

So I had a look at the code for System.BitConverter.GetBytes(int):

Code:
public static unsafe byte[] GetBytes(int value)
{
    byte[] buffer = new byte[4];
    fixed (byte* numRef = buffer)
    {
        *((int*) numRef) = value;
    }
    return buffer;
}

At first glance it seems like that'd be faster than the bit-shifting; it's just a straight-up memory mangle rather than a bunch of maths. It should only take a few assembly instructions to perform that operation. (And it does.) So I modified the testing code and planted the innards of GetBytes straight into my test code, to reduce the effect of calling into another method so frequently. Fired up the test with the modified Managed(int) method and..

Making Managed() into:
Code:
        /// <summary>
        /// This method uses System.BitConverter.GetBytes() to return a bytestream.
        /// </summary>
        /// <param name="toWrite"></param>
        /// <returns></returns>
        public unsafe byte[] Managed(int toWrite)
        {
            //m_Index = 0;

            //if ((this.m_Index + 4) > this.m_BufferLength)
            //{
            //    this.Flush();
            //}

            byte[] buffer = new byte[4];
            fixed (byte* numRef = buffer)
            {
                *((int*)numRef) = toWrite;
            }
            return buffer;
        }

Returns:

Code:
Testing System.BitConverter.GetBytes() against Bit-shifting
Test run of BitShifted:
Bitshifted finished in 00:00:08.8765452
Test run of GetBytes:
GetBytes finished in 00:00:14.3915536

It's still slower. Poking around in the assembly gets this:

Code:
            TheBuffer[this.m_Index++] = (byte)(toWrite >> 0x18 & 0xFF);
00000068  mov         rax,qword ptr [rsp+000000C0h] 
00000070  mov         eax,dword ptr [rax+8] 
00000073  mov         dword ptr [rsp+34h],eax 
00000077  mov         eax,dword ptr [rsp+34h] 
0000007b  mov         dword ptr [rsp+30h],eax 
0000007f  mov         ecx,dword ptr [rsp+34h] 
00000083  add         ecx,1 
00000086  mov         rax,qword ptr [rsp+20h] 
0000008b  mov         qword ptr [rsp+38h],rax 
00000090  mov         rax,qword ptr [rsp+000000C0h] 
00000098  mov         dword ptr [rax+8],ecx 
0000009b  mov         eax,dword ptr [rsp+000000C8h] 
000000a2  sar         eax,18h 
000000a5  and         eax,0FFh 
000000aa  mov         dword ptr [rsp+40h],eax 
000000ae  movsxd      rcx,dword ptr [rsp+30h] 
000000b3  mov         rax,qword ptr [rsp+38h] 
000000b8  mov         rax,qword ptr [rax+8] 
000000bc  mov         qword ptr [rsp+48h],rcx 
000000c1  cmp         qword ptr [rsp+48h],rax 
000000c6  jae         00000000000000D4 
000000c8  mov         rax,qword ptr [rsp+48h] 
000000cd  mov         qword ptr [rsp+48h],rax 
000000d2  jmp         00000000000000D9 
000000d4  call        FFFFFFFFF49A6950 
000000d9  mov         rdx,qword ptr [rsp+38h] 
000000de  mov         rcx,qword ptr [rsp+48h] 
000000e3  movzx       eax,byte ptr [rsp+40h] 
000000e8  mov         byte ptr [rdx+rcx+10h],al

Yeah.. that's just one of the array operations; the other 3 have roughly the same number of movs behind them, which partly confirms my theory about bit-shifting being more operations on the processor. (Keep in mind this is a debug build, so the JIT optimizer is off; an optimized build would likely collapse a lot of these movs.)

Looking at the disassembly for the Managed() method gives:

Code:
            fixed (byte* numRef = buffer)
00000060  mov         rax,qword ptr [rsp+20h] 
00000065  mov         qword ptr [rsp+38h],rax 
0000006a  cmp         qword ptr [rsp+20h],0 
00000070  je          000000000000007F 
00000072  mov         rax,qword ptr [rsp+38h] 
00000077  mov         rax,qword ptr [rax+8] 
0000007b  test        eax,eax 
0000007d  jne         000000000000008A 
0000007f  mov         qword ptr [rsp+28h],0 
00000088  jmp         00000000000000C8 
0000008a  mov         rax,qword ptr [rsp+38h] 
0000008f  mov         rax,qword ptr [rax+8] 
00000093  mov         qword ptr [rsp+40h],0 
0000009c  cmp         qword ptr [rsp+40h],rax 
000000a1  jae         00000000000000AF 
000000a3  mov         rax,qword ptr [rsp+40h] 
000000a8  mov         qword ptr [rsp+40h],rax 
000000ad  jmp         00000000000000B4 
000000af  call        FFFFFFFFF49A6660 
000000b4  mov         rcx,qword ptr [rsp+38h] 
000000b9  mov         rax,qword ptr [rsp+40h] 
000000be  lea         rax,[rcx+rax+10h] 
000000c3  mov         qword ptr [rsp+28h],rax 
            {
000000c8  nop              
                *((int*)numRef) = toWrite;
000000c9  mov         rcx,qword ptr [rsp+28h] 
000000ce  mov         eax,dword ptr [rsp+68h] 
000000d2  mov         dword ptr [rcx],eax 
            }
000000d4  nop              
000000d5  mov         qword ptr [rsp+28h],0 
            return buffer;
000000de  mov         rax,qword ptr [rsp+20h] 
000000e3  mov         qword ptr [rsp+30h],rax 
000000e8  jmp         00000000000000EA

So, it seems like the memory copy ( *((int*)numRef) = toWrite; ) is only 3 instructions in total, but setting up the fixed() requirement for the source of bytes is what takes so long. The code's a bit spaghetti'fied, but I can at least tell that the jmps only move around within the region of the pinning; there's a call that goes somewhere else, which I presume is to tell the GC to pin the memory in place. (KeepAlive()?)

I guess if you could do the memory assignment directly without having to pin the source buffer into memory to protect it from the GC, maybe GetBytes() would be faster. Maybe the compiler is quietly optimizing the code behind the scenes with MMX/SIMD instructions? My assembly is not as good as it used to be and I can only get the gist of the compiler's code. I know MOVSXD is part of the x86_64 additions, but the rest looks like plain un-SIMD'd instructions to me.

If anyone here understands assembly better, feel free to explain whats going on. :)

So in summary: yes, bit shifting is still faster than GetBytes no matter how you mangle it. I still have concerns that the bit-shifting will break if you moved to a big-endian machine, but 220%+ performance seems like a good reason to just branch a conditional for other-endian machines.

I also can't help but wonder if you could make this *really* fast by buffering up all the ints you want to split into bytes, feeding them into a CUDA/OpenCL program, and running them through a stream processor, getting the result back as a texture or something.

Also, you could maybe use a cached MemoryStream object to push ints into and get byte[]s back out, as below, but that'd probably have an even bigger overhead, and now it's just into the realm of vastly over-engineering a simple problem. :)

Code:
        // (Fragment: 'src' is the int being converted.)
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(src);       // BinaryWriter emits little-endian bytes
            return stream.ToArray();
        }
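
(If you actually wanted it cached as described, something like this untested variant would reuse one stream instead of allocating per call; the m_Stream / m_Writer fields and the IntToBytes name are made up for illustration:)

Code:
        private MemoryStream m_Stream = new MemoryStream();
        private BinaryWriter m_Writer;

        public byte[] IntToBytes(int src)
        {
            if (m_Writer == null)
                m_Writer = new BinaryWriter(m_Stream);

            // Rewind and truncate the cached stream before reusing it.
            m_Stream.Position = 0;
            m_Stream.SetLength(0);

            m_Writer.Write(src);
            return m_Stream.ToArray();
        }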

Yeah, long post, I know; I got interested in why GetBytes was slower even though it looks like it shouldn't be. I'm still using the bit shifts in my normal code, and will continue to do so until fixed() operations become less expensive to call en masse.
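
(One way to dodge that cost, sketched here untested: pin the destination buffer once per batch and write every int inside a single fixed block, so you pay for the pin once rather than per value. WriteInts and its signature are made up for illustration, and it assumes dest holds at least values.Length * 4 bytes.)

Code:
        public static unsafe void WriteInts(int[] values, byte[] dest)
        {
            // One pin for the whole batch instead of one per int.
            fixed (byte* pDest = dest)
            {
                int* p = (int*)pDest;
                for (int i = 0; i < values.Length; i++)
                {
                    p[i] = values[i]; // native (little-endian on x86) byte order
                }
            }
        }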
 

Tartaros

Wanderer
GodOfThePookies;835473 said:
What I meant by "different architecture" was moving from a big-endian CPU to a little-endian machine, such as running Mono on a Sun SPARC (big-endian); Windows (x86/64) is little-endian.
Unless, unbeknownst to me, .NET shields you from this somehow?

Yes, I know you were talking about endianness; me too.
In C# you don't code for Windows, or SPARC, or anything physical - you code for the .NET virtual machine. So yes, you're "shielded somehow".

This is basic C# dev knowledge btw :)


This was quite an interesting problem, but unfortunately you seem to have no idea what you're doing. Common usage of such "encoding" functions is writing data to a byte "buffer" or "stream", so it makes no sense to create a byte array inside the method the way you do, unless absolutely necessary.
Plus, in the "Managed" method, you create such an array and then throw it away without using it in any meaningful way, and last but not least you ignored my advice about using the Buffer class :)

Anyway, the endianness is incompatible; GetBytes produces a different result from the bit-shifting version, so it's useless for the case of .NET and UO. So the whole "research" is useless :p
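
(To make the mismatch concrete, a small untested illustration: on a little-endian machine the two approaches emit the same int in opposite byte orders.)

Code:
        int value = 0x01020304;

        // Bit-shifting, as in the original Write(): most significant byte first.
        // Produces { 0x01, 0x02, 0x03, 0x04 } on any architecture.
        byte[] shifted = {
            (byte)(value >> 0x18), (byte)(value >> 0x10),
            (byte)(value >> 8),    (byte)value
        };

        // GetBytes follows the machine's byte order: on little-endian x86 it
        // produces { 0x04, 0x03, 0x02, 0x01 } - the reverse.
        byte[] converted = BitConverter.GetBytes(value);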
 

Jeff

Lord
In theory, little-endian vs. big-endian shouldn't matter for RunUO; the client is going to send you the data in a specific order (little-endian) regardless...
 

Tartaros

Wanderer
Jeff;835621 said:
In theory, little-endian vs. big-endian shouldn't matter for RunUO; the client is going to send you the data in a specific order (little-endian) regardless...

Yeah, and that's why it does matter.
 

Tartaros

Wanderer
If it "didn't matter", you could use the GetBytes implementation.
But since you can't (opposite order), how can you say it doesn't matter?
 
Tartaros;835563 said:
Yes, I know you were talking about endianness; me too.
In C# you don't code for Windows, or SPARC, or anything physical - you code for the .NET virtual machine. So yes, you're "shielded somehow".

I don't think that's strictly true. Yes, there is a ".NET virtual machine", but it's not a VM like Sun VirtualBox is; it's not emulating the entire hardware set, only providing a GC, memory management, JIT compilation of your code from IL into native machine code, and a few other things. You still have to deal with the underlying iron, and in my specific case, that's a server running under Mono (yes, yes, not .NET, but it is a compatible implementation of the .NET CLR) on an old SPARC machine I have in the office here; it's more reliable than the development machine and it's already running other stuff. *shrug*

So the endianness of operations in my PacketReader/Writer/Stream class, the System.IO.BinaryReader/Writer, and a few other classes is of importance to me, and it's nothing to do with RunUO, UO, or otherwise. I only posted the original prototype to get some comments on it. The BinaryReader/Writer don't even seem to bother with endian checking; they just assume it's all in little-endian format. :/

Jeff;835485 said:
Mono is not .Net...
See above. I know Mono isn't .NET, but it is a compatible implementation of the .NET CLR and the supporting framework (excluding, obviously, the Windows-specific namespaces).

Tartaros;835563 said:
This was quite an interesting problem, but unfortunately you seem to have no idea what you're doing. Common usage of such "encoding" functions is writing data to a byte "buffer" or "stream", so it makes no sense to create a byte array inside the method the way you do, unless absolutely necessary.
Plus, in the "Managed" method, you create such an array and then throw it away without using it in any meaningful way, and last but not least you ignored my advice about using the Buffer class :)

The above posted code was (I thought quite clearly) defined as a test; it's not production code and it's not part of any real project. It's a few methods designed in a pure test environment to measure the performance of bit-shifting ints into bytes against using the BitConverter.GetBytes() method. I throw away the buffer because the buffer and its contents aren't important, only the time it takes to complete the byte-mangling.

The Buffer class isn't really appropriate for this situation, as it works on arrays rather than individual value types. If you packed the packet/stream's content into chunks and worked them 4-8 ints at a time, I guess it would work well enough.

I tried it anyway, packing the ints into an array of 4, then passing them into the ManagedBuffer() method below; it's kinda fugly though. I may have done something wrong, as it took nearly a minute to finish the dataset.

Code:
        /// <summary>
        /// This method uses the Buffer class to return a byte array from an int array.
        /// </summary>
        /// <remarks>This pulls the bytes in sequential order, completely ignoring byte order.</remarks>
        /// <param name="toWrite"></param>
        /// <returns>byte array of the ints passed in.</returns>
        public byte[] ManagedBuffer(int[] toWrite)
        {
            int m_Index = 0;
            byte[] TheBuffer = new byte[toWrite.Length * 4]; // Four bytes per int.

            for (int i = 0; i < toWrite.Length; i++)
            {
                // Copy the i-th int (source offset i * 4) to the current buffer position.
                Buffer.BlockCopy(toWrite, i * 4, TheBuffer, m_Index, 4);
                m_Index += sizeof(int);
            }

            return TheBuffer;
        }

But due to the packing, it requires a different invocation procedure in the test method.

Code:
            System.Console.WriteLine("Test run of ManagedBuffer:");

            watch.Reset(); watch.Start(); // Reset so the previous run's time isn't included.

            Int32[] intArray = new Int32[4];

            // Since Buffer only accepts arrays, we pack four ints at a time.
            for (int i = 0; i < Repetitions; i++)
            {
                for (int j = 0; j < 4; j++)
                {
                    intArray[j] = RNG.Next();
                }

                ManagedBuffer(intArray);
            }

            watch.Stop(); System.Console.WriteLine("ManagedBuffer finished in {0}", watch.Elapsed.ToString());

I'm not sure if the packing in situ ruins the test results; it should just be creating new ints (free) and calling RNG.Next() before passing a 4-element array into the ManagedBuffer() method, though note RNG.Next() now runs four times per iteration instead of once, so the RNG overhead is heavier than in the other tests. If you can think of a cleaner/better way to do it using the Buffer class, feel free. I don't feel it's suited to this situation due to the requirement of having arrays passed into it.
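
(For the record, a cleaner untested variant: since BlockCopy can move the whole array in one call, the loop isn't needed at all.)

Code:
        public byte[] ManagedBuffer(int[] toWrite)
        {
            byte[] TheBuffer = new byte[toWrite.Length * 4];

            // One call copies every int's bytes in sequential (native) order.
            Buffer.BlockCopy(toWrite, 0, TheBuffer, 0, TheBuffer.Length);

            return TheBuffer;
        }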

Jeff;835621 said:
In theory, little-endian vs. big-endian shouldn't matter for RunUO; the client is going to send you the data in a specific order (little-endian) regardless...

True, if the client is sending little-endian encoded ints (which it seemingly is) then it doesn't really matter. But, as above, it does matter when files created on the server (big-endian) are transmitted whole to the client (little-endian, possibly), and my client isn't the UO client. It works on a platform other than x86 Windows. ;)

Also, after looking into it, my comments about SIMD are irrelevant: Microsoft .NET doesn't support SIMD/SSE in any manner at all. Mono does, though, through Mono.Simd. I wonder if the test results would be different under Mono.

And yes, I do understand pointers. I get what the GetBytes function is doing. :p
 

Tartaros

Wanderer
The MS CLR is not implemented as a VM, but it is defined as such, and behaves as such. And it could be (and perhaps even has been, experimentally or for bootstrapping) implemented as such.

I don't know what "iron" you're talking about but the truth is that the virtual processor has a predefined endianness that will never change (provided you have a correct CLR implementation), so it would be absurd if BinaryReader/Writer methods somehow tried to check it.


You still don't get what I was talking about when I said you're throwing away an array... so I've commented the code here:
Code:
        public byte[] Managed(int toWrite)
        {
            m_Index = 0;

            byte[] TheBuffer = new byte[4];  // here you create the first 4-byte array
            //...
            TheBuffer = System.BitConverter.GetBytes(toWrite); // here you throw the first one away and return the one created by GetBytes

            return TheBuffer;
        }

So obviously you didn't realize GetBytes creates its own little byte[] - that's why I was talking about System.Buffer: about using it to copy the data from this GetBytes-created little array into the "official" result buffer, which is usually far bigger than 4 bytes.

I won't comment on your "packing" which doesn't really make sense, you got a little carried away :p
 

Jeff

Lord
Tartaros;835768 said:
if it "didn't matter", you could use the GetBytes implementation.
But since you can't (opposite order), how can you say it doesn't matter?

Well, I had meant bit shifting like RunUO currently does... That's where the misunderstanding is.
 
The only reason I noticed is the name used on my shard. Good name, that, but Tartaros' profile description to the left says account terminated.

Funny that? What have we done wrong?
 

Tartaros

Wanderer
Stygian Stalker;835863 said:
The only reason I noticed is the name used on my shard. Good name, that, but Tartaros' profile description to the left says account terminated.

Funny that? What have we done wrong?

huh?

My "account terminated" is probably something left over from some kind of forum migration in the past, plus me having been banned :D
But I kind of didn't get what your post was about.
 
Tartaros
Account Terminated


Join Date: Nov 2002
Posts: 42


Where your name is displayed with the post count; that is what I see.

Funny that.

On my shard my Admin name is tartaros.

So I asked what had "we" done wrong this time..........

Not saying anything other than, "hey, looka that. That's weird."
 