Explain this about threads

  • Thread starter: Jon Slaughter
Peter Duniho said:
Have you tried? The basic parallel port driver should be protocol
agnostic, AFAIK. You open it with CreateFile(), and it's just a
read/write stream.

The driver should take care of all the data integrity stuff, while your
application can worry about the application protocol.

I don't know how to do this using managed code, but I think that using
p/invoke to get at the unmanaged API would be a lot easier than the hoops
you're trying to jump through now.


Maybe it does, I haven't tried. I just used the kernel driver because that's
what everyone else doing what I'm doing uses.

In fact the DLL that wraps the kernel driver uses CreateFile to do this... so
essentially it is emulating what you're speaking of. I suppose it's done that
way for performance reasons.
Ensuring an average data rate is easy. You send some fixed amount of
data, and then you wait an appropriate amount of time. The amount of time
you wait and/or measure may be long, but there will always be an amount of
time you can select that provides an accurate-enough average data rate
even using the standard Windows timing mechanisms.

yes, but then how? Waiting an appropriate amount of time is much more
difficult if that time is very short.
Whether this is practical in your case, I can't say. It really depends on
why the timing is so critical to you. But generally speaking, it's just
not a problem. Average is just that: average. If you average over a long
enough time period, it's trivial to achieve any arbitrary average. You
just need to be able to select a long enough time period.
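
(For illustration, a rough sketch of that chunk-and-wait idea in C#. The target rate, chunk size, the 'bits' source and the PortAccess.Output() wrapper are all assumptions standing in for whatever the real code uses:)

// Hold an average rate by sending a burst of bits, then sleeping off
// whatever time that burst "owes" at the target rate. Only the average
// matters, so coarse Sleep() granularity is fine here.
const int targetHz = 20000;      // desired average bit rate (assumption)
const int chunkBits = 2000;      // roughly 100ms worth of bits per burst (assumption)
var sw = System.Diagnostics.Stopwatch.StartNew();
long bitsSent = 0;

foreach (bool bit in bits)       // 'bits' is whatever you are transmitting
{
    PortAccess.Output(dataPort, bit ? 1 : 0);   // placeholder port write
    bitsSent++;

    if (bitsSent % chunkBits == 0)
    {
        double targetMs = bitsSent * 1000.0 / targetHz;   // where we should be by now
        double aheadMs = targetMs - sw.ElapsedMilliseconds;
        if (aheadMs > 1)
            System.Threading.Thread.Sleep((int)aheadMs);  // pay back the time debt
    }
}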

The timing isn't critical except that I don't want it to take 5 years to
send 9 bits. Faster is better because it increases productivity and
decreases frustration.
[...] What I'm trying to achieve is a way to send data at an average rate
so this can be quantitatively adjusted by the user. So if the user wants
20khz he will get about 20khz and not 50khz or 5khz(on average).

Well, over what time period is it necessary that this average data rate be
achieved? Will the user care if the data transmission is in short, rapid
bursts that over a second or more still average out to the rate they've
selected?

No, it doesn't matter as it's an average and supposed to be on such a small
time scale that the user can't comprehend it anyway (so they will just see
the average).
Or is the requirement more sensitive than that?
[...]
Sure... but if you can come up with a better way then I'm all ears. This
is the method used by all the programs that I am aware that are doing
similar stuff to what I need.

For example? What Windows programs are you talking about that use this
library to access the CPU's i/o port directly, rather than going through
CreateFile to access the parallel port? How have you verified that they
use this technique? And if they use this technique, how do _they_ deal
with the data throttling issues?

Yes I have. Look up WinPic, InpOut32, PortTalk, GiveIO, etc... Anything to
do with embedded system communication using the parallel port almost surely
uses the method I'm doing.

They just bit bang and hope for the best and it works... They don't care
about blocking other threads or long delays or anything because it usually
works... and if it doesn't, then you insert a larger delay.

A simple example can be found

http://www.codeproject.com/csharp/csppleds.asp

I'll be working with PICs instead of LEDs (which are easy and require no
protocol) and LCDs.
Knowing answers to those questions would go a long way to better
understanding your specific situation.


As far as I know, a parallel printer driver does not implement its own
parallel port i/o. It uses the built-in Windows parallel port driver, and
relies on that driver to deal with the low-level issues. This is true
even for the older parallel port modes (often selectable in BIOS, so
Windows has to support them).

If it were me, I would use unmanaged code based on CreateFile using the
parallel port as my first attempt. Only if that didn't allow me to
achieve what I wanted would I mess around with this lower-level stuff.
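
(For what it's worth, a minimal sketch of that route from C# via P/Invoke. The "LPT1" name and the plain write are assumptions; this hasn't been checked against any particular device:)

using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

class ParallelPortDemo
{
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    static extern SafeFileHandle CreateFile(string fileName, uint access,
        uint share, IntPtr security, uint creation, uint flags, IntPtr template);

    const uint GENERIC_WRITE = 0x40000000;
    const uint OPEN_EXISTING = 3;

    static void Main()
    {
        // Open the parallel port through the built-in driver and treat it as a stream.
        SafeFileHandle h = CreateFile("LPT1", GENERIC_WRITE, 0, IntPtr.Zero,
                                      OPEN_EXISTING, 0, IntPtr.Zero);
        if (h.IsInvalid)
            throw new IOException("Could not open LPT1, error " +
                                  Marshal.GetLastWin32Error());

        using (FileStream port = new FileStream(h, FileAccess.Write))
        {
            byte[] data = { 0x01, 0x02, 0x03 };
            port.Write(data, 0, data.Length);   // the driver handles the wire-level handshaking
        }
    }
}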


Well, it might work. I don't see it being any different than using the
kernel mode driver I'm using since I think for all practical purposes it
does the same thing, except probably more efficiently (since it's built for this).

The problem I do see though is that in general the parallel port uses some
handshaking methods that I would end up having to emulate to get it to work.
This is completely unnecessary for what I'm doing and causes more problems
than it's worth.


Thanks,
Jon
 
A simple example can be found

http://www.codeproject.com/csharp/csppleds.asp

I'll be working with PICs instead of LEDs (which are easy and require no
protocol) and LCDs.

BTW, the second part is the lcd,

http://www.codeproject.com/csharp/cspplcds.asp

as you can see, the code for sending a command to the lcd is

/* Makes the enable pin high and register pin low */
PortAccess.Output(control, 8); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin low for LCD to read its
* data pins and also register pin low */
PortAccess.Output(control, 9); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Clears entire display and sets DDRAM address
* 0 in address counter */
PortAccess.Output(data, 1); //Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin high and register pin low */
PortAccess.Output(control, 8); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin low for LCD to read its
* data pins and also register pin low */
PortAccess.Output(control, 9); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* We are setting the interface data length to 8 bits
* with selecting 2-line display and 5 x 7-dot character font.
* Lets turn the display on so we have to send */
PortAccess.Output(data, 56); //Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin high and register pin low */
PortAccess.Output(control, 8); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin low for LCD to read its
* data pins and also register pin low */
PortAccess.Output(control, 9); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article
}

Which is just outputting and delaying. This command takes at least 10ms to
complete. This is no big deal for an LCD but huge for programming a PIC.
Using smaller delays results in better performance but requires spinwaits
because there is no way to block for less than 1ms. Also, it doesn't matter
if the sleep waits are larger than 1ms other than making it even slower... I
mean that it doesn't matter to the LCD (most likely, unless it has some
timeout circuitry).

Jon
 
Jon said:
[...]
If _all_ you really want to do specifically is block, then I'd say that
Sleep() would be the function that does that. So in that respect,
sure...that's the "special API routine" that blocks. But there's many
many other ways for a thread to block.

Well, sleep would be a perfect solution for me if I could do it for less
than 1ms. 1ms means that if I interlaced every output with sleep(1) that I
could run at a maximum of 1khz. This is extremely slow for my application.
If I could get 10us resolution then it would be much better. I know that
this would be counterproductive because it would require a task switch
every 10us, which is very costly.

Indeed. The task switch itself could use a significant amount of time
in that case.

I should point out that even though Sleep() takes 1 millisecond units,
you can't actually get 1 millisecond resolution from it. I think you've
already discovered this on your own, but it's worth reiterating. Once
you yield via Sleep(), all other threads at the same priority will get a
chance to run, and if no other threads at your same priority are
runnable, then threads of a lower priority will get to run.

The bottom line being that if there's even one thread ready to run, it
will run and potentially use its entire timeslice, which will be a lot
longer than 1 millisecond, and your thread won't get to run again until
at least that thread is done (and if there are multiple threads at the
same priority, then they _all_ get a shot at the CPU before your thread
gets to run again).

That's a long way of saying that when you call Sleep() even with the
minimal value of 1 millisecond, it could be a lot longer than that
before the call returns control to your thread.
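
(This is easy to see for yourself; a small sketch that measures what Sleep(1) actually costs on a given machine:)

using System;
using System.Diagnostics;
using System.Threading;

class SleepResolution
{
    static void Main()
    {
        const int iterations = 100;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            Thread.Sleep(1);
        sw.Stop();

        // Typically well over 1ms per call; the exact figure depends on the
        // system timer resolution and what else is runnable.
        Console.WriteLine("Average Sleep(1): {0:F2} ms",
            sw.Elapsed.TotalMilliseconds / iterations);
    }
}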
It seems there are 3 parts here.

who blocks, who unblocks, and what triggers the unblock.

I suppose the first one doesn't matter because in either case the thread
will get blocked. The second one cannot be the thread itself, because it makes
no sense for it to unblock itself since then it would have to be running. So this
must be the controller of the resource, because the scheduler has no idea
about the resource.

So really I think the problem is the last one. I imagine that the
controller of the resource (well, the code that interfaces with it) says
"Hey, the resource is available" and tells the scheduler that it can now
unblock the thread. Just not sure how that works. Maybe the details don't
matter too much though because it's kind of getting off on a tangent from my
original problem.

I do agree that the specifics may not be important here. Though, you
mentioned the possibility of writing your own embedded OS. If you do
that and it has any sort of thread scheduling, you'll want to understand
this stuff _very_ well. :)

For what it's worth, I think the way I like to look at it most of the
time is that the threads are moving themselves and other threads between
the "runnable" and "unrunnable" lists that the scheduler maintains. A
thread moves itself to the "unrunnable" list when it does something that
blocks. Then other threads do things that implicitly move unrunnable
threads back to the runnable list.

For example, one thread may call WaitHandle.WaitOne(), which moves that
thread to the unrunnable list. It remains on that list until some other
thread calls WaitHandle.Set() on the same WaitHandle instance.

The WaitHandle object itself includes some OS glue that actually handles
this. So the thread calling Set() doesn't itself really know anything
about what's going on. But it is in fact effectively causing the other
thread to be moved back to the runnable list.
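
(A minimal C# illustration of that hand-off, assuming nothing beyond the standard library: one thread parks itself on WaitOne() and another wakes it with Set():)

using System;
using System.Threading;

class BlockUnblockDemo
{
    static readonly AutoResetEvent signal = new AutoResetEvent(false);

    static void Worker()
    {
        Console.WriteLine("Worker: waiting (off the runnable list)...");
        signal.WaitOne();                     // blocks here until someone signals
        Console.WriteLine("Worker: signalled, runnable again.");
    }

    static void Main()
    {
        Thread worker = new Thread(Worker);
        worker.Start();

        Thread.Sleep(500);                    // give the worker time to block first
        Console.WriteLine("Main: calling Set().");
        signal.Set();                         // effectively moves the worker back to runnable
        worker.Join();
    }
}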

All of this happens independently of the scheduler. All the scheduler
does is wake up when one of two things happens: a thread yields, or the
timeslice expires. It checks to see what thread gets the CPU next based
on the priorities of the threads in the runnable list, manages the
switch to that thread, and then lets that thread continue executing,
itself going back to sleep until it's needed again.

This is necessarily a bit of an oversimplification. There's lots of
other stuff in there to deal with multiple threads waiting on the same
resource, to prevent absolute thread starvation, etc. But that's the
basic idea.
Yeah, I guess I need to read about these in the context of an operating
system. I know why they are used but I guess I'm not clear on exactly how
they are implemented... of course it's probably not such a simple topic.

As is the case with many things, the basic idea really is actually
pretty simple. There are complicating factors to ensure that things
work right in all situations, but the fundamentals of the implementation
are fairly straightforward IMHO.
Yes... but ultimately it doesn't matter because I have to spinwait to get
the timing right... So if I write a kernel mode driver then I have to
spinwait in there, and sure I end up blocking the calling thread but also
everything else... but I don't seem to have any choice because I have to
time things and PCs don't have the ability to do such high resolution
timing (AFAIK).

PC's can easily deal with the timing. But Windows does not. This isn't
a hardware limitation, it's a basic issue of the implementation of this
particular multi-tasking OS. Granted, the implementation is not a whole
lot different from other multi-tasking OS's, but in each case the
limitation is fundamental to the OS, not the hardware itself.
[...]
So I guess when I call sleep I put a block out, but whatever mechanism is
behind sleep (the OS I guess) has the code that will unblock my code after
the time has elapsed?

Yes. The OS is keeping track of the time the thread wants to sleep, and
when that's expired it knows to make the thread runnable again.
[...]
Define "controller of the resource". In some respects, the OS is the
"controller". That is, ultimately it is the OS managing who gets the
resource at any given time. On the other hand, you could say that
whatever thread currently has acquired the resource is the "controller".
That is, until that thread releases the resource, the resource is
controlled by that thread and unavailable to any other thread.

I mean the lowest level code that works with the resource like a driver.

Well, keep in mind that the "resource" may not be very low-level. For
example, synchronization objects are fairly high-level things, IMHO.
Certainly nothing like hardware interrupts that drivers might have to
deal with, for example.

But yes, in the case of a driver handling i/o, for example, you can
certainly view the driver as the "controller of the resource".
[...]
Ok. I guess I would need to see exactly how it's implemented to feel
comfortable with it, but I do see how it could work. I guess it's just hard
for me to see what kind of code has to be behind such seemingly simple calls
as ReadLine to get all that stuff to work.

The main thing that makes it complicated is that the same mechanisms are
used for a variety of situations. If all you had to do was implement
Console.ReadLine(), that would be easy. But because of the way an OS is
layered, to build specific functionality on top of more basic
functionality, until you get to the highest levels, it winds up seeming
fairly complicated.

Even at that though, you might be surprised at how little actual code
really needs to execute in order to manage a thread's call to
ReadLine(). :)
I guess though it's very similar to how DOS works? If you wanted to output text
you would call an interrupt routine (although you could write directly to
video memory I suppose)... the act of interrupting is blocking and the
unblocking is when the interrupt returns from the call?

Sounds like this is very similar but on a more complicated level because of
the multitasking environment.

IMHO, because of the multitasking environment, it's completely different
from the way DOS works. And yet, I suppose in another way it's exactly
the same.

First, when referencing DOS I think it's important to differentiate
between interrupts that are basically just an OS API and interrupts that
actually do interrupt things.

The latter is a lot like the thread scheduler preempting one thread and
scheduling a different one to run. The former is not a lot different
from calling a Windows OS API. The main difference being that Windows
being a multi-tasking OS, some calls to the API cause your thread to
block and wait on some other thread.

The way in which it's exactly the same is easiest to understand on a
single-processor PC. That is, in the end the processor can only execute
one instruction at a time. So while Windows has the concept of a thread
scheduler and it does in fact rely on the low-level hardware interrupts
to work correctly, if you have a bunch of threads that never can finish
their timeslice without needing to block, it doesn't really work a lot
different than some DOS application that uses some sort of cooperative
round-robin scheduling.

For example, there were those TSR programs you could use in DOS that all
hooked to the same interrupts, and chained themselves. They all got a
shot at the CPU, by cooperating with each other and doing a little bit
of work at a time and then calling the next one in the chain.

The thread scheduler in Windows does this in a much more formalized way,
but the end result is the same: execution jumps from piece of code to
piece of code, letting each piece do a little bit of work at a time.
[...]
But I can still use the port to do the communications because it's just
simply sending bits in a predefined way. It's clocked/synchronous
communication, so there is a clock and a data line. I can use two pins on
the port and just output bits in the correct order to do my communications.

Problem is, in some cases the protocol specifies that the slave will need to
take over the communications (such as the acknowledgement part). So maybe I
can do something like

out 1
out 0
out 1
out 1

but then I need to wait for an acknowledgement from the slave.

How can I do this without polling or interrupts?

If the slave sends the acknowledgement and I'm not listening then I've lost
it. If I can't use interrupts then the only other option is to poll. (And when
it sends the acknowledgement it only lasts for several microseconds.)

From your description, perhaps you do need to poll.

However, given that it's likely that your thread will spend at least as
much time not executing as it does executing, it seems like you are
still engaged in risky business.

If you use a technique to ensure that you don't start sending data until
right at the beginning of a timeslice, and you know for sure that you
can complete an entire transaction within a timeslice, then I suppose it
would work fine.

But otherwise, you run the risk of having that acknowledgment sent when
your thread doesn't have the CPU.
[...]
If you have no managed access to the i/o from the hardware, and no
integrated unmanaged access to the i/o (that is, via one of the
higher-level i/o API in Windows), then I suppose it's possible polling is
your only option. However, even there what you should do is use a call to
Sleep(), with a 1 millisecond timeout, any time you poll and don't have
any work to do, to ensure that your thread immediately yields its
timeslice.

Too slow ;/ I would love to use sleep as it's the easiest method but it's just
too slow. 1ms resolution would just kill my performance.

Well, just so I'm clear: I'm not suggesting calling Sleep(1) in between
each byte you send. I'm suggesting that you do as much processing as
you need to at the moment, and then yield using Sleep(1).

If you can't just execute the calls to output the data to the port as
fast as possible, then you may still have to do the spin-wait thing to
slow your code down. And you may have to send and then wait for the
acknowledgment before yielding. But you can at least do all that stuff,
assuming you know it will happen fast enough to complete in a single
timeslice.
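
(A sketch of that shape, assuming a SendBit()/WaitForAck() pair and a calibrated spin count; those names are placeholders, not anything from the real code:)

using System.Threading;

class CommandSender
{
    // Placeholders: in the real code these would go through the port driver.
    static void SendBit(bool bit) { /* out(...) one clock/data transition */ }
    static void WaitForAck()      { /* poll the status lines until the ack shows up */ }

    static int spinsPerBitGap = 50;   // calibrated separately for the inter-bit delay

    static void SendCommand(bool[] commandBits)
    {
        foreach (bool bit in commandBits)
        {
            SendBit(bit);                      // output one bit
            Thread.SpinWait(spinsPerBitGap);   // hold the minimum gap between bits
        }

        WaitForAck();      // catch the acknowledgment while we still own the timeslice

        Thread.Sleep(1);   // yield once per command, not once per bit
    }
}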
For one of the protocols each command is about 30 bits long and there are
minimum wait times of about 40ns between each bit sent. Since Sleep() can't
wait for less than 1ms, each of those gaps would become at least a 1ms sleep,
which means I would have to wait ~30ms to send just one command.

If you yielded between each bit, it would be worse than that, because
calling Sleep(1) isn't going to block for just 1 millisecond. It can
easily block for tens of milliseconds. A 30 bit command could take a
second or more to complete.

But I'm not suggesting that. I hope the above clarification explains it
better.
[...]
I think the problem is that blocking does nothing for my app because it's
not a "timed" block. I'll end up blocking the thread anyway but I'll do it
at the command level.

so essentially instead of inserting sleep between every bit, I'll insert
short spinwaits and then unblock after the full command is sent.

I assume you mean "then block after the full command is sent". And yes,
I think that would be a fine way to do it, assuming you really have to
do the spinwait thing.

It would be ideal if you could just use the built-in parallel port
driver, but you're saying you can't and I don't have any knowledge that
would contradict that. If you can't use the built-in i/o API, then I'd
agree there's not a good thread-friendly way to do what you want.
[...]
Fortunately, for practically all i/o that a Windows application might be
asked to do, the OS switches between threads quickly enough that the
end-user never will notice any difference.

Well, that's a problem though because Windows has the benefit of
multi-tasking... which makes it much better than DOS but falls short for
timed communications. Would be nice if they added some ability for both.

To some extent, this is addressed by most hardware devices. That is,
they provide hardware level buffering and flow control, allowing the
software to be a little more, well...soft about timing.
I mean, what would be really nice is to have a small device dedicated to
doing such a thing. Actually they are all over the PC but they are all
hardware and not programmable.

It seems to me that the device that's the problem here is whatever
you're connecting to the PC. The parallel port on the PC definitely can
do buffering and flow control, but apparently the devices you're
communicating with don't support that. The real issue is the hardware,
not the lack of support on the PC.

Another thing I would explore, assuming you've already verified that the
unmanaged Windows parallel port i/o technique won't work, is whether
there is a hardware device that will conform to whatever protocol these
devices use, but which can on the other end do the necessary buffering
and flow control so that the PC can access it via the higher-level
hardware protocols.

It seems silly to me in this day and age that hardware exists that is so
primitive, and at the very least there ought to be a good hardware
solution to allow more convenient and modern software solutions for
accessing the device.
[...]
Well, IMHO that's known as "DMA". :)

Well, but its then a "dumb" piece of a hardware. What I mean is something
that sorta runs a sperate piece of code at a specific timed rate... like a
small thread but that is not part of the main cpu... maybe just 1 register
and a little memory for the stack and buffers that are also not part of the
main memory.

I think what you mean is something programmable. A DMA controller isn't
all that "dumb"; it has its own internal execution flow, it's just not
something you can program directly.

My main point is actually just that there are parts of hardware on the
computer that deal with i/o without the handholding your parallel
interface devices apparently need. That's probably not useful to you,
but they do exist. :)
[...]
You seem to be asking about the other direction; software being too fast
for the hardware. But that begs the question, why is the hardware driver
not taking care of this already. I seem to recall this was a regular
parallel port; am I misremembering? A standard parallel port driver
should handle all of the buffering you need for it to work correctly with
whatever parallel device is attached to the hardware.

Maybe it does? I didn't know the parallel port buffered anything in SPP, or
if it does it's a very small buffer. But even if it did, then it might run too
fast?

Could be. I don't know. With less primitive hardware, it wouldn't be a
problem. But if your hardware has no way to indicate to the parallel
port hardware on the PC that it's going too fast, I guess you have to
manage this explicitly.
[...]
Those things are out of my control. If Windows has to interrupt my thread
when I'm halfway through sending a command then I can't do anything about it... it just
slows down the data rate... this is one reason why the speed is "critical". (Not
that it's critical in the sense that it has to be fast to work, but it has to
be fast to work well, and in this case that = more productivity.)

I guess one thing I still don't understand is, if it's theoretically
possible for your code to be interrupted, and all that doing so causes
is the data rate to slow down, how can it still be possible that you
have to poll at a high rate to catch acknowledgments. What happens if
you complete sending the command just before your thread gets preempted,
and then while the thread is suspended by the scheduler the
acknowledgment arrives?
[...]
Maybe I should write up specifically what I'm trying to do in another thread
so its more clear?

Probably. I agree that this thread seems to be pretty much wrung out,
as the basic thread blocking questions you were asking seem to have been
answered.

Frankly though, I think your application is pretty far afield for this
newsgroup altogether. The kinds of things you're doing all involve
using various unmanaged techniques. Your program might be a C# .NET
program, but none of the stuff you're asking about is really all that
much related to C# or .NET.

So while starting a new thread more specific to your needs probably
makes more sense than continuing this thread, I think in reality you'd
probably get much better answers if you found a newsgroup where people
are regularly doing things more like what you're doing. You're
certainly outside my main areas of expertise, and I've even done a
little bit of driver, low-level i/o code. There are a handful of other
people who post here regularly that I think have the sort of experience
that you'd find useful (for sure they know a lot more about this stuff
than I do), but there's probably at least one newsgroup where there are
dozens, if not hundreds, of people with that kind of experience.

You'll probably get much better advice on this particular topic in such
a newsgroup. :)

Pete
 
Jon Slaughter said:
Willy Denoyette said:
Jon Slaughter said:
"Instead of just waiting for its time slice to expire, a thread can
block each time it initiates a time-consuming activity in another
thread until the activity finishes. This is better than spinning in a
polling loop waiting for completion because it allows other threads to
run sooner than they would if the system had to rely solely on
expiration of a time slice to turn its attention to some other
thread."



I don't get the "a thread can block each time...". What does it mean
by blocking? Does it mean that if thread B needs something from thread
A that thread A stops thread B from running until its finished but not
interfer with some other thread C?

Thanks,

Jon




Adding to what others have said in this thread:
1) You should never SpinWait on a single processor machine; doing so
prevents other threads in the system from making progress (unless this is
exactly what you are looking for).
Waiting for an event (whatever) from another thread in a SpinWait loop
prevents the other thread from signaling the event, so basically you are
wasting CPU cycles for nothing.
2) Define the count such that you spin for less than the time needed
to perform a transition to the kernel and back when waiting for an
event. Spinning for a longer period is just a waste of CPU cycles; you'd
better give up your quantum by calling Sleep(1) or P/Invoking the
Kernel32 "SwitchToThread" API in that case.


Ok, tell me then how I can do clocked I/O in a timely fashion without
spinning?

Let's suppose I have to slow down the rate because it's simply too fast for
whatever device I'm communicating with if I do not insert delays.

I would be really interested in knowing how I can do this without using
spinwaits, because it is a problem that I'm having. It seems like it's a
necessary evil in my case.

Basically I'm trying to do synchronous communication with the parallel
port. I have the ability to use in and out, which are supplied by a kernel
mode driver and DLL wrapper. So if I want to output some data to the
port I can do "out(data)" and it will output it.

if I have something like

for(int i = 0; i < data.Length; i++)
out(data[i]);

Then this will run at about 100khz or so (on my machine). Now what if I
need to slow it down to 20khz? How can I do this without using spin
waits but still do it in a timely fashion? IGNORE ANY DELAYS FROM TASK
SWITCHING! I cannot control the delays that other processes and task
switching introduce, so since it's beyond my control I have to ignore it.
What's important is the upper bound that I can get and the average, and
not the lower bound. So when I say it needs to run at 20khz it means as
an upper bound.

for(int i = 0; i < data.Length; i++)
{
out(data[i]);
Thread.SpinWait(X);
}

Where X is something that slows this down enough to run at 20khz. I can
figure out X on average by doing some profiling, i.e., if I know how
long out takes and how long Thread.SpinWait(1) takes (on average) then I
can get an approximate value for X.
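
(One rough way to pick X: time a batch of output+SpinWait(X) iterations and nudge X toward the period you want. 'Out' below stands in for the poster's out() wrapper, and the numbers are made up:)

// Crude calibration: measure how long one Out()+SpinWait(x) iteration takes
// and scale x toward the target period (50us per bit for 20khz). This ignores
// the fixed cost of Out() itself, so it only converges approximately.
int x = 10;
const double targetPeriodUs = 1e6 / 20000.0;   // 50 microseconds
var sw = new System.Diagnostics.Stopwatch();

for (int round = 0; round < 10; round++)
{
    const int samples = 1000;
    sw.Reset();
    sw.Start();
    for (int i = 0; i < samples; i++)
    {
        Out(0);                                  // placeholder for out(data)
        System.Threading.Thread.SpinWait(x);
    }
    sw.Stop();

    double measuredUs = sw.Elapsed.TotalMilliseconds * 1000.0 / samples;
    x = Math.Max(1, (int)(x * targetPeriodUs / measuredUs));
}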

But how can I do this without using spin waits?



The data transfer rate on a parallel port is a matter of the handshake
protocol between the port and the device; basically it's the device that
decides the (maximum) rate. The exact transfer rates are defined in the
IEEE 1284 protocol standards (IEEE 1284.1, .2, .3, .4...) and the modes (like
Compatible, Nibble, Byte and ECP mode) supported by the parallel port
peripheral controller chips. All these kinds of protocols (parallel port,
serial ports, networks, USB, other peripheral protocols) were invented
exactly to be able to control the signaling rates between the system and
the device; the PC hardware and the Windows OS are simply not designed for
this, they are not real-time capable.


No, this is only for ECP and EPP. There is no handshaking or hardware
protocol in SPP, which is what I'm using. It is also necessary for me to
use SPP because the device that is attached does not use the same protocol
that EPP/ECP uses.
Now, if you don't have a device connected that negotiates or respects one
of the IEEE 1284 protocol modes, you have a problem. You can't accurately
time the I/O transfer rate; all you can do is insert waits in your code
(user mode or in a driver) and as such define a top-level rate but
no lower rate!
The easiest way (but still a dirty way to do it) is by inserting delays like
you do in your code. This is not a problem for small bursts (say a few
hundred bytes) on a single processor box, and a few KB on multi-cores,
assuming that you don't further peg the CPU between each burst, so that
other threads don't starve.


Well, that's what I'm doing, but I'm trying to find the optimal method. This
is also the method used by most programs that do things similar to what I'm
trying to do.

I think I'm going to write a simple kernel mode driver that does all the
communications using direct port access (instead of the IOCTL methodology).
It's more of a hack but is probably as fast as I can get it. Of course that
method will cause problems with other drivers and stuff but I don't have
to worry about that.

I can also use the interrupt to get information on a regular basis but not
sure how well this will work.

I was thinking that maybe I could use an interrupt and then an external
clock that will trigger the interrupt very precisely, and that would
probably give me a pretty accurate method, but it would probably starve the
system because of all the task switching per clock. I guess I have no
choice but to either use something like DOS or some hardware proxy that
can deal with the latency issues.




SPP uses the status lines (Ack, Busy ... signals) to control the data flow
between device and controller; the driver has to read the status of the Ack
signal after each byte transferred in order to control the signaling rate.
Writing your own driver won't help you any further as long as you don't use
the handshaking protocol as defined in 1284 at the device driver level. Sure,
you can move the SpinWait loop to the driver level, but this is no different
from doing it at the user level; you are burning CPU cycles without any more
guarantee that you won't lose bytes because the device is not ready to
accept any more data. I would keep these things in user space anyway; it's
simply a matter of trying to find the optimal value for the spin counter.
Willy.
 
Jon said:
Ok, these are locks... but who implements them? I can't see how a program
can block itself with respect to task scheduling... except maybe it blocks but
then something else has to unblock it.

Whoever is responsible for managing the resource at hand "owns" the
synchronization mechanism; often enough this is the OS or some major subsystem of
same. You are correct that the unblocking is done by "something else".
Typically the something else is a driver, an interrupt handler, or some external
task. Think Producer/Consumer: the Consumer blocks (via Monitor.Wait()) until
the Producer makes available whatever the Consumer is needing. It is the
Producer who must wake up the sleeping Consumer via Monitor.Pulse(). (I highly
recommend Jon Skeet's very lucid explanation of this mechanism.)
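
(A bare-bones version of that Producer/Consumer hand-off, in the spirit of the Monitor.Wait()/Pulse() description above; just a sketch, not a production-grade queue:)

using System;
using System.Collections.Generic;
using System.Threading;

class ProducerConsumer
{
    static readonly object gate = new object();
    static readonly Queue<int> queue = new Queue<int>();

    static void Consumer()
    {
        lock (gate)
        {
            while (queue.Count == 0)
                Monitor.Wait(gate);        // blocks: releases the lock and sleeps
            Console.WriteLine("Consumed " + queue.Dequeue());
        }
    }

    static void Producer()
    {
        lock (gate)
        {
            queue.Enqueue(42);
            Monitor.Pulse(gate);           // wakes the sleeping consumer
        }
    }

    static void Main()
    {
        Thread consumer = new Thread(Consumer);
        consumer.Start();
        Thread.Sleep(200);                 // let the consumer block first
        Producer();
        consumer.Join();
    }
}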
Ok, so you're saying essentially that when a synchronous call is made that requires
a resource, you block yourself? Somehow you say "I'll wait until the
resource is free"... but then something else must unblock you. But it would
seem then that to do any type of blocking you have to know explicitly the
resource to block on, so whatever else can unblock it. Seems more complicated
than just having the other thing block and unblock.


What I mean is that it makes more sense to me for whatever that is
controlling the resource to control blocking and unblocking.

Example.

Program:
Hey, Save this file

Resource Handler:
(Internally to scheduler: Block Program,
Save the file,
Return status msg and unblock program)
Ok, I saved it.
------

instead of

Program:
Hey, Save this file
(tell scheduler to block me)

Resource Handler:
(Internally to scheduler:
Save the file,
Return status msg and unblock program)
Ok, I saved it.


I guess though it doesn't quite matter where it does it, and maybe it's
actually better for the program to block itself... just seems like there's no
context there and it could block itself for any reason even if the resource
handler doesn't need to block. (Which I guess would be asynchronous comm.)

Hmm, I don't want to get too deep into the details of your pseudocode here. You
are doing a lot of rather sophisticated guesswork but your overall model is
maybe a little skewed. One key piece of the picture is that a correctly
implemented synchronization method does not block arbitrarily. If my program
issues ReadLine() and there is already a CRLF in the input buffer the call will
return all bytes up to and including CRLF without blocking - the requested
"resource" was available in this case and the result is just a normal subroutine
call and return sequence. But if I issue ReadLine() and there is no CRLF to be
read I must wait until there is - so the ReadLine() routine will follow a
different path that eventually calls an OS routine which one way or another
causes suspension of my task. Later some other task or OS component will notice
that there is now a CRLF available at the console and so I can be woken up and
allowed to proceed. Note that from the program's point of view it still looks
like the same call/return sequence - the intervening mechanism of waiting for
the resource is effectively transparent unless your program is time-sensitive.
I think eventually I will take a look at some embedded operating systems
because I'll probably need one in the future, but at this point I just want
to write a program to communicate with some devices I have that use some
protocols (ICSP, I2C and SPI). I want it to be general enough so I can program
these protocols in a nice way (instead of hard coding them). That way, in the
future, if I want to add another one such as Modbus or RS-232 (emulated on the
parallel port by polling... or just end up using the serial port) I can
without too much trouble.

I don't know what ICSP is, but I would be surprised if you can do either I2C or
SPI from DotNet. Or maybe you are running an embedded version on a micro?
Modbus (at least Modbus-ASCII) on the other hand could be done in managed code
on a PC running any version of Windows.

-rick-
 
Peter Duniho said:
Jon said:

Indeed. The task switch itself could use a significant amount of time in
that case.

I should point out that even though Sleep() takes 1 millisecond units, you
can't actually get 1 millisecond resolution from it. I think you've
already discovered this on your own, but it's worth reiterating. Once you
yield via Sleep(), all other threads at the same priority will get a
chance to run, if if no other threads at your same priority are runnable,
then threads of a lower priority will get to run.

The bottom line being that if there's even one thread ready to run, it
will run and potentially use its entire timeslice, which will be a lot
longer than 1 millisecond, and your thread won't get to run again until at
least that thread is done (and if there are multiple threads at the same
priority, then they _all_ get a shot at the CPU before your thread gets to
run again).

That's a long way of saying that when you call Sleep() even with the
minimal value of 1 millisecond, it could be a lot longer than that before
the call returns control to your thread.

Well, those are specific cases. As long as the average behavior is fast enough
then it's ok. (Of course I would still like to minimize the latency, but that
means I just have to close down as many other processes as possible.)
I do agree that the specifics may not be important here. Though, you
mentioned the possibility of writing your own embedded OS. If you do that
and it has any sort of thread scheduling, you'll want to understand this
stuff _very_ well. :)

Yes, I know. But that project will be a long way off and I'll get some books
on it when I need it. I can only do so much with what little time I have ;/
For what it's worth, I think the way I like to look at it most of the time
is that the threads are moving themselves and other threads between the
"runnable" and "unrunnable" lists that the scheduler maintains. A thread
moves itself to the "unrunnable" list when it does something that blocks.
Then other threads do things that implicitly move unrunnable threads back
to the runnable list.

Yeah. I guess my problem is that when I did most of my heavy programming it
was back in the days of DOS and there was never really any threading to
worry about. I don't ever recall even working with threads, except if you
consider interrupts and timers threads.
As is the case with many things, the basic idea really is actually pretty
simple. There are complicating factors to ensure that things work right
in all situations, but the fundamentals of the implementation are fairly
straightforward IMHO.

Of course ;) But most of those complicated things are much easier to learn
by experience within the actual problem instead of in some abstract
sense (at least for me).
PC's can easily deal with the timing. But Windows does not. This isn't a
hardware limitation, it's a basic issue of the implementation of this
particular multi-tasking OS. Granted, the implementation is not a whole
lot different from other multi-tasking OS's, but in each case the
limitation is fundamental to the OS, not the hardware itself.

Yes, I meant under Windows. It is quite easy to do very high resolution
timing on a 3GHz PC (except maybe for some of the optimizations the PC does
which could cause some problems in theory).

[...]
But I can still use the port to do the communications because its just
simply sending bits in a predefined way. Its clocked/synchronous
communications so there is a clock and a data line. I can use two pins
on the port and just output bits in the correct order to do my
communications.

Problem is, in some cases the protocol specifies that the slave will need
to take over the communications(such as the acknowledgement part). So
maybe I can do something like

out 1
out 0
out 1
out 1

but then I need to wait for an acknowledgement from the slave.

How can I do this without polling or interrupts?

If the slave sends the acknowledgement and I'm not listening then I've
lost it. If I can't use interrupts then the only other option is to poll.
(And when it sends the acknowledgement it only lasts for several
microseconds.)

From your description, perhaps you do need to poll.

However, given that it's likely that your thread will spend at least as
much time not executing as it does executing, it seems like you are still
engaged in risky business.

If you use a technique to ensure that you don't start sending data until
right at the beginning of a timeslice, and you know for sure that you can
complete an entire transaction within a timeslice, then I suppose it would
work fine.

But otherwise, you run the risk of having that acknowledgment sent when
your thread doesn't have the CPU.

That's true. I have to read up more on the specs to see how the
acknowledgement works. I'm hoping that it's clocked in, so the slave actually
synchronizes the ack with the clock and then it's no big deal. Else I'll have
to figure out how to do it. I know there are some issues such as where the
slave takes control, such as if it runs into an error it forces the clock
line to go low and the master has to realize this (well, it actually doesn't
matter but it should be aware).

For simple ICSP I don't think there is anything like this, but for I2C it is
much more complicated in general, and maybe not doable with standard
Windows. So either I'll have to implement it in hardware (which is easy
since PICs support I2C inherently) or figure out something else.
[...]
If you have no managed access to the i/o from the hardware, and no
integrated unmanaged access to the i/o (that is, via one of the
higher-level i/o API in Windows), then I suppose it's possible polling
is your only option. However, even there what you should do is use a
call to Sleep(), with a 1 millisecond timeout, any time you poll and
don't have any work to do, to ensure that your thread immediately yields
its timeslice.

Too slow ;/ I would love to use sleep as it's the easiest method but it's
just too slow. 1ms resolution would just kill my performance.

Well, just so I'm clear: I'm not suggesting calling Sleep(1) in between
each byte you sent. I'm suggesting that you do as much processing as you
need to at the moment, and then yield using Sleep(1).

If you can't just execute the calls to output the data to the port as fast
as possible, then you may still have to do the spin-wait thing to slow
your code down. And you may have to send and then wait for the
acknowledgment before yielding. But you can at least do all that stuff,
assuming you know it will happen fast enough to complete in a single
timeslice.

We'll see. I'll do some tests and see what I can get away with. I'm not
trying to make this super duper application that does what it does perfectly
and also plays fair with everything else. It just needs to work well enough
to do its job even if in some cases it doesn't.

If you yielded between each bit, it would be worse than that, because
calling Sleep(1) isn't going to block for just 1 millisecond. It can
easily block for tens of milliseconds. A 30 bit command could take a
second or more to complete.

But I'm not suggesting that. I hope the above clarification explains it
better.

Yeah. I know. I was just making an estimation. My tests show that it's
actually pretty good timing on average. Maybe that's just my computer though. I tend
to run in a minimized environment (most drivers and services are turned off
except for necessary ones and a few others). I don't run any crap like most
people, except MSN.
[...]
I think the problem is that blocking does nothing for my app because
it's not a "timed" block. I'll end up blocking the thread anyway but
I'll do it at the command level.

so essentially instead of inserting sleep between every bit, I'll insert
short spinwaits and then unblock after the full command is sent.

I assume you mean "then block after the full command is sent". And yes, I
think that would be a fine way to do it, assuming you really have to do
the spinwait thing.

I might not have to spin wait, or may just spin wait for a very little bit of
time, depending on how long the port call takes.
It would be ideal if you could just use the built-in parallel port driver,
but you're saying you can't and I don't have any knowledge that would
contradict that. If you can't use the built-in i/o API, then I'd agree
there's not a good thread-friendly way to do what you want.
[...]
Fortunately, for practically all i/o that a Windows application might be
asked to do, the OS switches between threads quickly enough that the
end-user never will notice any difference.

Well, that's a problem though because Windows has the benefit of
multi-tasking... which makes it much better than DOS but falls short for
timed communications. Would be nice if they added some ability for
both.

To some extent, this is addressed by most hardware devices. That is, they
provide hardware level buffering and flow control, allowing the software
to be a little more, well...soft about timing.

yeah, unfortunately I don't have that ;/
It seems to me that the device that's the problem here is whatever you're
connecting to the PC. The parallel port on the PC definitely can do
buffering and flow control, but apparently the devices you're
communicating with don't support that. The real issue is the hardware,
not the lack of support on the PC.

The devices do support USART and stuff, but not for programming. The
programming method is different and done the way they do it for whatever
reason. It's actually very simple to do, and it would be overkill to have to
use some more advanced protocol.
Another thing I would explore, assuming you've already verified that the
unmanaged Windows parallel port i/o technique won't work, is whether there
is a hardware device that will conform to whatever protocol these devices
use, but which can on the other end do the necessary buffering and flow
control so that the PC can access it via the higher-level hardware
protocols.

The PICs will work... but it's not really communication, it's programming. And
all embedded microprocessors use very similar programming methods.

The problem is that being able to use some higher level hardware protocols
either requires more expensive hardware or that it be done in software...
but if you're programming the chip then there is no software... so it has to
be done somehow in the hardware.

What they do now is use USB to send the program to a proxy device which then
programs the next device. But they had to program the proxy device by some
method that definitely wasn't USB.
It seems silly to me in this day and age that hardware exists that is so
primitive, and at the very least there ought to be a good hardware
solution to allow more convenient and modern software solutions for
accessing the device.

Yeah. I think so too. I guess there's just too much inertia involved in
integrating better solutions into such things.

I personally would like to see the whole idea of computers redesigned using
what we have learned instead of just building on top of all the mistakes.

Actually a personal dream of mine would be to get all the big hardware,
software companies, and universities to come together to sit down and design
a new computing system from scratch taking into account all things learned
in the past and what we need in a system for the future. Maybe a 10-20 year
project with both software and hardware guys communicating about their needs
and problems, along with industry's needs, etc...

Of course that will never happen but I can still dream ;)

I just feel like there's a lot of dead weight in computers that has
accumulated over the many years... not that it was a bad thing, as most of it
was just sort of evolution... but the problem is that we have learned and
implemented so little (outside of research). I suppose if these large
companies and universities could work together for a common goal, and not to
make money, then maybe something really great could come out of it. (I'd be
interested in a sort of operating system that takes into account artificial
intelligence and tries to increase productivity of the average user.)
[...]
(Actually what would be nice is to have a small separate CPU that is
specifically designed for timed communications so one could load some
code there and it will always run at some specified rate and is
independent of the main CPU and OS)
Well, IMHO that's known as "DMA". :)

Well, but its then a "dumb" piece of a hardware. What I mean is
something that sorta runs a sperate piece of code at a specific timed
rate... like a small thread but that is not part of the main cpu... maybe
just 1 register and a little memory for the stack and buffers that are
also not part of the main memory.

I think what you mean is something programmable. A DMA controller isn't
all that "dumb"; it has its own internal execution flow, it's just not
something you can program directly.

I meant dumb sort of like a dumbfire missile. Sort of shoot it and forget
it (cause it either hits or not). So you sort of program the DMA and it does
what it does and you cannot control that.
My main point is actually just that there are parts of hardware on the
computer that deal with i/o without the handholding your parallel
interface devices apparently need. That's probably not useful to you, but
they do exist. :)

Yeah ;/ They actually are useful, but not for programming MCUs. They are too
high level for that, because it requires some type of software logic (usually)
or special hardware to do the programming. I'm sure it's not impossible but the
devices just don't use that method ;/ I wish they did, cause then I wouldn't
be wasting my time ;)
[...]
You seem to be asking about the other direction; software being too fast
for the hardware. But that begs the question, why is the hardware
driver not taking care of this already. I seem to recall this was a
regular parallel port; am I misremembering? A standard parallel port
driver should handle all of the buffering you need for it to work
correctly with whatever parallel device is attached to the hardware.

Maybe it does? I didn't know the parallel port buffered anything in SPP,
or if it does it's a very small buffer. But even if it did, then it might
run too fast?

Could be. I don't know. With less primitive hardware, it wouldn't be a
problem. But if your hardware has no way to indicate to the parallel port
hardware on the PC that it's going too fast, I guess you have to manage
this explicitly.
[...]
Those things are out of my control. If Windows has to interrupt my thread
when I'm halfway through sending a command then I can't do anything about it... it
just slows down the data rate... this is one reason why the speed is
"critical". (Not that it's critical in the sense that it has to be fast to
work, but it has to be fast to work well, and in this case that = more
productivity.)

I guess one thing I still don't understand is, if it's theoretically
possible for your code to be interrupted, and all that doing so causes is
the data rate to slow down, how can it still be possible that you have to
poll at a high rate to catch acknowledgments. What happens if you
complete sending the command just before your thread gets preempted, and
then while the thread is suspended by the scheduler the acknowledgment
arrives?

Then I lose it, unless the ack is clocked too. (I still don't know because I
haven't read too much about I2C in that regard. If it's clocked then it
doesn't matter, because it's just whenever I can send the next clock pulse. If
it's not clocked then I'll lose it if I can't get back in time, unless the ack
lasts till some predetermined event. I'd imagine that for I2C though it's not
a problem, because it sort of runs in a multi-tasked environment... That is,
it's a multi-device protocol and there can be hundreds of devices in the
signal path. Any one of them could stop the communications for some reason.
(Well, I'm not too sure about that, as the master only communicates with one
device at a time I believe.)
[...]
Maybe I should write up specifically what I'm trying to do in another
thread so its more clear?

Probably. I agree that this thread seems to be pretty much wrung out, as
the basic thread blocking questions you were asking seem to have been
answered.

Frankly though, I think your application is pretty far afield for this
newsgroup altogether. The kinds of things you're doing all involve using
various unmanaged techniques. Your program might be a C# .NET program,
but none of the stuff you're asking about is really all that much related
to C# or .NET.

So while starting a new thread more specific to your needs probably makes
more sense than continuing this thread, I think in reality you'd probably
get much better answers if you found a newsgroup where people are
regularly doing things more like what you're doing. You're certainly
outside my main areas of expertise, and I've even done a little bit of
driver, low-level i/o code. There are a handful of other people who post
here regularly that I think have the sort of experience that you'd find
useful (for sure they know a lot more about this stuff than I do), but
there's probably at least one newsgroup where there are dozens, if not
hundreds, of people with that kind of experience.

You'll probably get much better advice on this particular topic in such a
newsgroup. :)

Yeah, I kinda got off topic here about that. I started this stuff in C#
though and it's led me to messing with kernel mode stuff.

I think I'm just going to learn about kernel mode drivers for the hell of it
though ;) I'll try and write something like inpout32, but maybe with a
little stuff for protocols or writing buffers. (So my C# program could send
a "command" in an array and the driver will do the rest.)

I'm going to need to read up on the protocols more though, just to be sure
I'm not missing anything. I'm sure it can't be too hard though, because there
are apps that do it.

Thanks,
Jon
 
Rick Lones said:
Hmm, I don't want to get too deep into the details of your pseudocode
here. You are doing a lot of rather sophisticated guesswork but your
overall model is maybe a little skewed. One key piece of the picture is
that a correctly implemented synchronization method does not block
arbitrarily. If my program issues ReadLine() and there is already a CRLF
in the input buffer the call will return all bytes up to and including
CRLF without blocking - the requested "resource" was available in this
case and the result is just a normal subroutine call and return sequence.
But if I issue ReadLine() and there is no CRLF to be read I must wait
until there is - so the ReadLine() routine will follow a different path
that eventually calls an OS routine which one way or another causes
suspension of my task. Later some other task or OS component will notice
that there is now a CRLF available at the console and so I can be woken up
and allowed to proceed. Note that from the program's point of view it
still looks like the same call/return sequence - the intervening mechanism
of waiting for the resource is effectively transparent unless your program
is time-sensitive.
I think this is essentially what I mean. I guess there is no real difference
in who blocks the thread though, because it will get blocked either way; it would
just be a different internal mechanism but probably have equivalent results.
I don't know what ICSP is, but I would be surprised if you can do either
I2C or SPI from DotNet. Or maybe you are running an embedded version on a
micro? Modbus (at least Modbus-ASCII) on the other hand could be done in
managed code on a PC running any version of Windows.


Well, I'd like to implement them all... but at this point the main one is
ICSP. The real difference is that ICSP is for programming MCUs while the
others are for communication. Since you are programming, it requires more
data lines to control the MCU (such as power and a power-on sequence), but it is
essentially clocked communication and very similar in its overall "look" to
any other clocked protocol such as I2C and SPI.

It is not a complicated protocol though. In fact it's very simple. I just
want to use the best method I can to get the best results I can. I could
easily do this in C# using the kernel mode driver to do "indirect" access to
the port. But I do not want to bit bang here, as it seems like a very
inelegant solution. (Although for all practical purposes it works.)

I'll see what happens though. I'm sure learning about kernel mode
programming can't hurt ;)

Thanks,
Jon
 
SPP uses the status lines (Ack, Busy ... signals) to control the data
flow between device and controller; the driver has to read the state of
the Ack signal after each byte transferred in order to control the
signaling rate.
Writing your own driver won't help you any further as long as you don't
use the handshaking protocol as defined in IEEE 1284 at the device driver
level. Sure, you can move the SpinWait loop to the driver level, but this
is no different from doing it at the user level: you are burning CPU
cycles without any more guarantee that you won't lose bytes because the
device is not ready to accept any more data. I would keep these things in
user space anyway; it's simply a matter of trying to find the optimal
value for the spin counter.
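If you do go the user-space route, one way to pick that spin value is to
calibrate it against a high-resolution timer at startup. A rough sketch - the
target delay and the doubling search are just illustrative:

using System;
using System.Diagnostics;
using System.Threading;

class SpinCalibration
{
    // Find a spin count that burns roughly the requested number of microseconds.
    static int CalibrateSpins(double targetMicroseconds)
    {
        Stopwatch sw = new Stopwatch();
        for (int spins = 1; spins < int.MaxValue / 2; spins *= 2)
        {
            sw.Reset();
            sw.Start();
            Thread.SpinWait(spins);
            sw.Stop();

            double elapsedUs = sw.Elapsed.TotalMilliseconds * 1000.0;
            if (elapsedUs >= targetMicroseconds)
                return spins;   // coarse doubling search; refine further if needed
        }
        return int.MaxValue / 2;
    }

    static void Main()
    {
        int spins = CalibrateSpins(50.0);   // aim for a ~50 microsecond pacing delay
        Console.WriteLine("Spin count for ~50 us: " + spins);
    }
}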

SPP doesn't use those lines at the hardware level (AFAIK); they are handled
in software. For programming a PIC MCU (or other MCUs) that protocol is not
needed, but a similar protocol can be implemented on the parallel port.
Essentially you enable one pin, disable another, then send bits out on a
third pin. It's not complicated at all. Even the timing isn't complicated,
except that if it's too fast it does no good. Too slow isn't good either, and
that's really the main issue here. If I could sleep for microseconds then
this thread would probably never have been started (assuming the performance
hit for all that task switching is irrelevant).
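For what it's worth, the bit-bang idea in C# with InpOut32 boils down to
something like this. The base address, pin assignments and spin count are
assumptions - they depend on the actual wiring:

using System;
using System.Runtime.InteropServices;
using System.Threading;

class BitBangSketch
{
    // inpout32.dll exports Out32 for raw port writes.
    [DllImport("inpout32.dll")]
    static extern void Out32(short portAddress, short data);

    const short DataPort = 0x378;   // typical LPT1 base address (assumption)
    const byte ClockBit = 0x01;     // D0 used as clock (hypothetical wiring)
    const byte DataBit  = 0x02;     // D1 used as data  (hypothetical wiring)

    // Shift one byte out LSB-first, pulsing the clock for each bit.
    static void SendByte(byte value)
    {
        for (int i = 0; i < 8; i++)
        {
            short bits = (((value >> i) & 1) != 0) ? DataBit : (short)0;
            Out32(DataPort, bits);                        // present the bit, clock low
            Out32(DataPort, (short)(bits | ClockBit));    // clock high - device samples here
            Thread.SpinWait(1000);                        // crude pacing, tune to taste
            Out32(DataPort, bits);                        // clock back low
        }
    }

    static void Main()
    {
        SendByte(0xA5);
    }
}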

The reason I want to move most of the lower level stuff into a kernel mode
driver is that it should speed up the communications overall, and that means
less spin waiting because I can send the information faster (assuming I
don't end up being limited to 10 kHz or something like that). Also, all the
task switching between the user mode app and the kernel mode driver at the
bit level seems like a waste of cycles. (Now, if I end up having to
"consume" those cycles in a spinwait in the kernel then it doesn't matter.)

I'll just have to do some tests and see which works best. It may turn out
that kernel mode offers no real benefit over doing it in user mode.

Thanks,
Jon
 
Jon Slaughter said:
SPP doesn't use those lines at the hardware level (AFAIK); they are handled
in software. For programming a PIC MCU (or other MCUs) that protocol is not
needed, but a similar protocol can be implemented on the parallel port.
Essentially you enable one pin, disable another, then send bits out on a
third pin. It's not complicated at all. Even the timing isn't complicated,
except that if it's too fast it does no good. Too slow isn't good either,
and that's really the main issue here. If I could sleep for microseconds
then this thread would probably never have been started (assuming the
performance hit for all that task switching is irrelevant).

The reason I want to move most of the lower level stuff into a kernel mode
driver is that it should speed up the communications overall, and that
means less spin waiting because I can send the information faster (assuming
I don't end up being limited to 10 kHz or something like that). Also, all
the task switching between the user mode app and the kernel mode driver at
the bit level seems like a waste of cycles. (Now, if I end up having to
"consume" those cycles in a spinwait in the kernel then it doesn't matter.)

I'll just have to do some tests and see which works best. It may turn out
that kernel mode offers no real benefit over doing it in user mode.

Thanks,
Jon


Please tell me what kind of parallel port driver you use here. With
driver I mean the bus driver that is the lowest driver in the stack, and also
what OS you are running this on.
If you are running on W2K or higher, by default you are using the system
supplied parallel port *bus driver*; this driver knows several modes, of
which the Compatible (Centronics) mode is selected by default; other modes
need to be selected by the upper filter driver. In this simple mode, the
"busy" signal is the handshake line used to indicate that the device is
ready to accept data from the parallel port host.
In its simplest mode (compatibility), the protocol works like this:
- the busy signal is checked by the PC before it sets data on the data out
lines; the bus driver code needs to read the status register in order to
signal a time-out whenever the busy condition persists for too long.
- if the device is not busy, the PC sets the data on the data lines and
toggles the strobe line (done by the PC)
- the device signals acceptance of the data by toggling the Ack line
and possibly the busy line.
Now if the device doesn't toggle the busy line (which I can't believe), then
you are in trouble, as you can never know when the device is ready to accept
any data, and there is nothing you can do about this; delaying data output
is just a hack, you are never certain that the device is ready. You should
definitely look for a better solution, really.

Now I assume the device toggles that busy line, and I also assume that your
driver is a simple filter driver on top of the system supplied driver
(please correct me if I'm wrong). If this filter driver doesn't provide a
means to read the status register (via the IOCTL_PARIO_READ_PORT_STATUS IOCTL
command) from the lower bus driver, then you are in trouble and you are back
at square one.
Every IOCTL_PARIO_WRITE_PORT_DATA can overflow the output FIFO, resulting in
loss of data. However, if the filter driver provides a means to read the
status byte from the bus driver (IOCTL_PARIO_READ_PORT_STATUS), then you
need to do so before you can write any data to the device
(IOCTL_PARIO_WRITE_PORT_DATA). You could do this at the user level, but
this requires almost the same wait hack as you have now; much better is to
implement this in the filter driver. Implementing this is a simple matter of
buffering the data from the client (user space application) and blocking when
the buffer is (nearly) full. The filter driver can then transfer the data
to the bus driver in a coordinated and well synchronized manner.

Willy.
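For reference, the compatibility-mode handshake described above maps onto the
SPP registers roughly like this from user mode (the base address and spin
counts are assumptions, and the other control bits are ignored for brevity):

using System;
using System.Runtime.InteropServices;
using System.Threading;

class CentronicsSketch
{
    [DllImport("inpout32.dll")] static extern void Out32(short port, short data);
    [DllImport("inpout32.dll")] static extern short Inp32(short port);

    const short Base    = 0x378;        // LPT1 base address (assumption)
    const short Data    = Base;         // data register
    const short Status  = Base + 1;     // status register: Busy = bit 7 (inverted)
    const short Control = Base + 2;     // control register: Strobe = bit 0 (inverted)

    static void WriteByte(byte b)
    {
        // Wait until the peripheral deasserts Busy.
        // Busy is hardware-inverted, so bit 7 reads 1 when the device is ready.
        while ((Inp32(Status) & 0x80) == 0)
            Thread.SpinWait(100);

        Out32(Data, b);             // put the byte on the data lines
        Out32(Control, 0x01);       // assert Strobe (the bit is inverted in hardware)
        Thread.SpinWait(100);       // hold it briefly
        Out32(Control, 0x00);       // deassert Strobe
    }

    static void Main()
    {
        foreach (byte b in new byte[] { 0x48, 0x69 })   // "Hi"
            WriteByte(b);
    }
}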
 
Willy Denoyette said:
Please tell me what kind of parallel port driver you use here. With
driver I mean the bus driver that is the lowest driver in the stack, and
also what OS you are running this on.
If you are running on W2K or higher, by default you are using the system
supplied parallel port *bus driver*; this driver knows several modes, of
which the Compatible (Centronics) mode is selected by default; other modes
need to be selected by the upper filter driver. In this simple mode, the
"busy" signal is the handshake line used to indicate that the device is
ready to accept data from the parallel port host.
In its simplest mode (compatibility), the protocol works like this:
- the busy signal is checked by the PC before it sets data on the data
out lines; the bus driver code needs to read the status register in order
to signal a time-out whenever the busy condition persists for too long.
- if the device is not busy, the PC sets the data on the data lines and
toggles the strobe line (done by the PC)
- the device signals acceptance of the data by toggling the Ack line
and possibly the busy line.
Now if the device doesn't toggle the busy line (which I can't believe),
then you are in trouble, as you can never know when the device is ready to
accept any data, and there is nothing you can do about this; delaying data
output is just a hack, you are never certain that the device is ready. You
should definitely look for a better solution, really.

No, I do know, because I can define my own protocol for which lines do what.
And for programming in ICSP the device is never busy... hell, it's not even
technically on yet... because you're programming it.

I think you do not understand what's going on and think that the parallel
port has to follow a certain protocol. This is not true in SPP mode. You
are in full control of what the lines do. There is really no such thing as a
busy line. It's just a pin on the port that is an open collector. You can use
it for whatever you want. If you want to communicate using the "printer
protocol" then you'll need it for busy... otherwise you can use it for
sending information (it's not so good for receiving because that sort of
requires polling).

The driver I'm using is direct port access... there is no bus and no other
drivers. It's exactly the same as if it was done in DOS. Check out InpOut32
for what I'm talking about. Or even PortTalk (which I don't use).

The parallel port in SPP mode is essentially 8 data lines that are configured
for output but can also be configured for input, 5 status lines that are
configured for input, and 4 control lines that are mainly output but, since
they are open collector, can also be used for input.

The SPP communications protocol is implemented in software and defines what
those pins' functions are... it is not obligatory to use them. In ECP and EPP
modes it is different because much of the protocol is in hardware.
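To make that register layout concrete, the three SPP registers sit at
consecutive I/O addresses and can be poked directly with InpOut32. The base
address is an assumption, and whether the data lines can actually be turned
around depends on the port being bidirectional:

using System;
using System.Runtime.InteropServices;

class SppPinsSketch
{
    [DllImport("inpout32.dll")] static extern void Out32(short port, short data);
    [DllImport("inpout32.dll")] static extern short Inp32(short port);

    const short Base = 0x378;   // LPT1 base address (assumption)

    static void Main()
    {
        // Data register (base+0): the eight data lines, normally outputs.
        Out32(Base, 0xAA);

        // Status register (base+1): the five input lines
        // (nError, Select, PaperOut, nAck, Busy) live in bits 3..7.
        int status = Inp32((short)(Base + 1)) & 0xF8;

        // Control register (base+2): on a bidirectional port, setting bit 5
        // turns the data lines around so they can be read as inputs.
        Out32((short)(Base + 2), 0x20);
        int dataIn = Inp32(Base) & 0xFF;

        Console.WriteLine("status=0x{0:X2} dataIn=0x{1:X2}", status, dataIn);
    }
}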
Now I assume the device toggles that busy line, and I also assume that
your driver is a simple filter driver in top of the system supplied driver
(please correct me if I'm wrong), if this filter driver doesn't provide a
means to read the status register (via the IOCTL_PARIO_READ_PORT_STATUS
IOCT command) from the lower bus driver, then you are in trouble and you
are back at square zero.
Every IOCTL_PARIO_WRITE_PORT_DATA can overflow the output FIFO resulting
in loss of data. However, if, the filter driver provides a means to read
the status byte from the bus driver (IOCTL_PARIO_READ_PORT_STATUS ), then
you need to do it before you can write any data to the device
(IOCTL_PARIO_WRITE_PORT_DATA ). You could do this at the user level, but
this requires almost the same wait hack as you have now, much better is to
implement this in the filter driver. Implementing this is a simple matter
of buffering the data from the client (user space application) and block
when the buffer is (nearly) full. The filter driver can simply transfer
the data to the bus driver in a coordinated and well synchronized sense.


No, you're assuming too much. I think you're getting SPP mode confused with
the other two, which are implemented in hardware.

http://www.beyondlogic.org/spp/parallel.htm

"Centronics is an early standard for transferring data from a host to the
printer. The majority of printers use this handshake. This handshake is
normally implemented using a Standard Parallel Port under software control."

OK, I do not know what "normally" means exactly, but on the PC it is done in
software.

What this means is exactly what I said above: you are in full control of
the "protocol". So I can use my own.

Also, there are no filter drivers or anything. I'm in direct control of the
parallel port since I bypass all other drivers. Maybe there is some
virtualization going on, I do not know... but in the driver I can use either
READ_PORT_UCHAR or even just the "in" instruction (which I know isn't a good
method at all, but it does work, and for my purposes it can be used since no
other devices will be communicating with the port at the same time...)

Jon
 
Jon Slaughter said:
No, I do know, because I can define my own protocol for which lines do
what. And for programming in ICSP the device is never busy... hell, it's
not even technically on yet... because you're programming it.

I think you do not understand what's going on and think that the parallel
port has to follow a certain protocol. This is not true in SPP mode. You
are in full control of what the lines do. There is really no such thing as
a busy line. It's just a pin on the port that is an open collector. You can
use it for whatever you want. If you want to communicate using the
"printer protocol" then you'll need it for busy... otherwise you can use it
for sending information (it's not so good for receiving because that sort
of requires polling).

The driver I'm using is direct port access... there is no bus and no other
drivers. It's exactly the same as if it was done in DOS. Check out InpOut32
for what I'm talking about. Or even PortTalk (which I don't use).

The parallel port in SPP mode is essentially 8 data lines that are
configured for output but can also be configured for input, 5 status lines
that are configured for input, and 4 control lines that are mainly output
but, since they are open collector, can also be used for input.

The SPP communications protocol is implemented in software and defines what
those pins' functions are... it is not obligatory to use them. In ECP and
EPP modes it is different because much of the protocol is in hardware.



No, you're assuming too much. I think you're getting SPP mode confused with
the other two, which are implemented in hardware.

http://www.beyondlogic.org/spp/parallel.htm

"Centronics is an early standard for transferring data from a host to the
printer. The majority of printers use this handshake. This handshake is
normally implemented using a Standard Parallel Port under software
control."

OK, I do not know what "normally" means exactly, but on the PC it is done
in software.

What this means is exactly what I said above: you are in full control of
the "protocol". So I can use my own.

Also, there are no filter drivers or anything. I'm in direct control of the
parallel port since I bypass all other drivers. Maybe there is some
virtualization going on, I do not know... but in the driver I can use
either READ_PORT_UCHAR or even just the "in" instruction (which I know
isn't a good method at all, but it does work, and for my purposes it can be
used since no other devices will be communicating with the port at the
same time...)

Jon

Please don't teach me how I should write device drivers; I have done it for
years for a living at Digital Equipment, Compaq and HP. When I'm talking
about SPP mode I know what I'm talking about; it's not any of the other
modes like nibble, byte, EPP, ECP and so on.
The SPP protocol is a PROTOCOL mode, it stands for "Standard Parallel
Protocol", also known as "Compatible", "Centronics" or "Standard" mode. You
don't use the SPP mode because 1) your device is NOT conforming (it doesn't
toggle the busy line and what else), it's a raw device, and 2) your driver
(InpOut32) is not conforming either, it doesn't check the control and
status registers of the PIC; it's a very simple driver, nothing to be used
in a professional application.
That said, your device is used in raw mode, doesn't use any handshaking, and
as a result you have a problem.
One way to solve this is by inserting a delay between each character you
output, but you should do this at the driver level, not at the user level.
That means you should do something like this:

- Output one char (outp(port, value);)
- SpinWait(x); // where x is something like 1 µsec, using the kernel API
KeStallExecutionProcessor(x), NOT a user level SpinWait
- Set the Strobe control signal
- SpinWait(x); // where x = 1 µsec
- Reset Strobe
- SpinWait(y); // where y is 20 µsec
- Repeat for every character in the buffer, and hope the device picks up the
character before you send the next.

Keep in mind not to spin for too long, and that at a lower IRQ level the OS
may preempt you while in the SpinWait loop, spoiling the party.

Willy.
 
Willy Denoyette said:
Please don't teach me how I should write device drivers; I have done it for
years for a living at Digital Equipment, Compaq and HP. When I'm talking
about SPP mode I know what I'm talking about; it's not any of the other
modes like nibble, byte, EPP, ECP and so on.
The SPP protocol is a PROTOCOL mode, it stands for "Standard Parallel
Protocol", also known as "Compatible", "Centronics" or "Standard" mode. You
don't use the SPP mode because 1) your device is NOT conforming (it
doesn't toggle the busy line and what else), it's a raw device, and 2) your
driver (InpOut32) is not conforming either, it doesn't check the control
and status registers of the PIC; it's a very simple driver, nothing to be
used in a professional application.
That said, your device is used in raw mode, doesn't use any handshaking,
and as a result you have a problem.
One way to solve this is by inserting a delay between each character you
output, but you should do this at the driver level, not at the user level.
That means you should do something like this:

OK, I guess you're one of those guys: "I've been doing it for 1834 years so
shut the **** up".

I'm not going to discuss it any more. If you can't admit that you're
wrong or that you don't know everything about it then there's no use. Maybe
I misused the term, but all my research has said SPP = Standard Parallel
Port, not Protocol. The protocol is Centronics. In any case it doesn't
matter, because it doesn't have to be used that way, as can easily be
demonstrated. As I said, one has full control over the parallel port when in
"SPP" mode. All the shit you're talking about is not applicable to my
problem because I am not using the Centronics communications protocol with
any type of handshaking.

http://www.beyondlogic.org/spp/parallel.htm

In any case I have better things to do than argue with you about what you
think the PC parallel port is and isn't capable of, as I know it can do what
I want because I have done it to some degree and I have seen other
applications do it.
 
Yeah, I understand this. I don't understand how B can actually do any
Ah, but it is. The second case, that is. Normal Windows applications
don't have direct access to the interrupts, no. But they do have access
to methods that allow the OS itself to use the interrupts, which
implicitly provides a mechanism for the application itself to use the
interrupts.

This is, in fact, how a lot of the various i/o methods work.

Just wanted to throw in here, that depending on your registry settings (this
is true by default for workstations), whenever an I/O operation completes,
any thread waiting on that resource gets a dynamic priority boost, which can
have the effect of "interrupting" whatever thread is currently running (or
at least moving to the head of the queue).

http://msdn2.microsoft.com/en-us/library/ms684828.aspx
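As a small aside, that boost behaviour can be observed (and switched off)
from managed code at the process level; a minimal illustration:

using System;
using System.Diagnostics;

class PriorityBoostCheck
{
    static void Main()
    {
        Process me = Process.GetCurrentProcess();

        // True by default on workstation SKUs: threads get a temporary
        // priority boost when the I/O they were waiting on completes.
        Console.WriteLine("Priority boost enabled: " + me.PriorityBoostEnabled);
    }
}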
 
Peter Duniho said:
Have you tried? The basic parallel port driver should be protocol
agnostic, AFAIK. You open it with CreateFile(), and it's just a
read/write stream.

The driver should take care of all the data integrity stuff, while your
application can worry about the application protocol.

That's all well and good, but it's not only the application protocol Jon
needs control of, it's the framing protocol as well. CreateFile() on the
port puts you at the wrong layer of the OSI model.

Kind of like trying to make DHCP requests using a TCP proxy connection. You
just can't. You have complete control over the application protocol, but
since DHCP doesn't use TCP, you cannot formulate a DHCP packet with a TCP
connection.

Similarly CreateFile("LPT1:") enforces a particular framing protocol,
whereas writing to the I/O port addresses of the parallel port controller
lets you control each parallel port pin individually.
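To illustrate the difference: opening the port by name hands you a byte
stream whose framing the driver owns, which is exactly what makes it
unusable for a custom pin-level protocol (the byte values here are just
filler):

using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

class Lpt1StreamSketch
{
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    static extern SafeFileHandle CreateFile(string name, uint access, uint share,
        IntPtr security, uint creation, uint flags, IntPtr template);

    const uint GENERIC_WRITE = 0x40000000;
    const uint OPEN_EXISTING = 3;

    static void Main()
    {
        // Bytes written here go through the class driver's Centronics-style
        // handshake: framing for free, but no control over individual pins.
        using (SafeFileHandle h = CreateFile(@"\\.\LPT1", GENERIC_WRITE, 0,
                   IntPtr.Zero, OPEN_EXISTING, 0, IntPtr.Zero))
        {
            if (h.IsInvalid)
                throw new System.ComponentModel.Win32Exception();

            using (FileStream lpt = new FileStream(h, FileAccess.Write))
            {
                byte[] payload = { 0x01, 0x02, 0x03 };
                lpt.Write(payload, 0, payload.Length);
            }
        }
    }
}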
 
Ben said:
Just wanted to throw in here, that depending on your registry settings (this
is true by default for workstations), whenever an I/O operation completes,
any thread waiting on that resource gets a dynamic priority boost, which can
have the effect of "interrupting" whatever thread is currently running (or
at least moving to the head of the queue).

Well, if I recall correctly, boosting the priority of a thread never
actually preempts another thread. So it'd always be the latter (your
parenthetical remark).

But yes, you're right...depending on what a thread was waiting on, it is
not necessarily the case that when it becomes runnable it has to wait on
every other thread of the same priority. It may get to go to the head
of the line, and only have to wait for the currently executing thread to
finish its timeslice.

Pete
 
Ben said:
That's all well and good, but it's not only the application protocol Jon
needs control of, it's the framing protocol as well. CreateFile() on the
port puts you at the wrong layer of the OSI model.

Easy for you to say now, after Jon's elaborated on the scenario. :)

I didn't have that luxury when I wrote the comment. :p
 
