Discussion:
dummy_hcd performance / correctness
p***@public.gmane.org
2014-05-30 05:52:02 UTC
Permalink
I couldn't use dummy_hcd with g_tcm_gadget (UAS storage) with recent 3.15-rc
kernels, firstly it wouldn't even work because stream support was broken
and when I could get it to work with g_mass_storage, performance was horrible
due to the driver using jiffies instead of hrtimers.

With these two patches dummy_hcd works for me acceptably (no doubt it can
be improved even more). Without them I get 4MB/s with usb-storage
(because uas.ko wouldn't bind due to broken streams regression).
With them I get 100MB/s with uas / tcm_usb_gadget (and 25MB/s
for g_mass_storage) in an ordinary 3.15-rc8 HZ=250 tickless kernel.

Please review and consider merging, other people have complained for
the performance problem of dummy_hcd as well, e.g:

http://stackoverflow.com/questions/18606393/very-low-performance-of-g-mass-storage-virtual-usb-device
http://comments.gmane.org/gmane.linux.usb.general/86580

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
p***@public.gmane.org
2014-05-30 05:52:03 UTC
Permalink
From: Pantelis Koukousoulas <pktoss-***@public.gmane.org>

Recent commit (14aec589327a6fc4035f5327d90ac5548f501c4c) added the
"can_do_streams" field to the hcd structure but only set it to 1
in the XHCI driver.

dummy_hcd can also do streams so set can_do_streams = 1 for that
as well.

This fixes the regression that uas host driver stopped accepting
to bind to the tcm_usb_gadget under dummy_hcd after the offending
commit above.
---
drivers/usb/gadget/dummy_hcd.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/usb/gadget/dummy_hcd.c b/drivers/usb/gadget/dummy_hcd.c
index 8c06430..67ebf90 100644
--- a/drivers/usb/gadget/dummy_hcd.c
+++ b/drivers/usb/gadget/dummy_hcd.c
@@ -2544,6 +2544,8 @@ static int dummy_hcd_probe(struct platform_device *pdev)
retval = usb_add_hcd(ss_hcd, 0, 0);
if (retval)
goto put_usb3_hcd;
+
+ ss_hcd->can_do_streams = 1;
}
return 0;
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alan Stern
2014-05-30 14:54:49 UTC
Permalink
Post by p***@public.gmane.org
Recent commit (14aec589327a6fc4035f5327d90ac5548f501c4c) added the
"can_do_streams" field to the hcd structure but only set it to 1
in the XHCI driver.
dummy_hcd can also do streams so set can_do_streams = 1 for that
as well.
This fixes the regression that uas host driver stopped accepting
to bind to the tcm_usb_gadget under dummy_hcd after the offending
commit above.
---
drivers/usb/gadget/dummy_hcd.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/gadget/dummy_hcd.c b/drivers/usb/gadget/dummy_hcd.c
index 8c06430..67ebf90 100644
--- a/drivers/usb/gadget/dummy_hcd.c
+++ b/drivers/usb/gadget/dummy_hcd.c
@@ -2544,6 +2544,8 @@ static int dummy_hcd_probe(struct platform_device *pdev)
retval = usb_add_hcd(ss_hcd, 0, 0);
if (retval)
goto put_usb3_hcd;
+
+ ss_hcd->can_do_streams = 1;
This line needs to go before usb_add_hcd(), not after.
Post by p***@public.gmane.org
}
return 0;
Fix that up and then you can add:

Acked-by: Alan Stern <stern-nwvwT67g6+6dFdvTe/***@public.gmane.org>

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Greg KH
2014-05-30 23:34:43 UTC
Permalink
Post by p***@public.gmane.org
Recent commit (14aec589327a6fc4035f5327d90ac5548f501c4c) added the
"can_do_streams" field to the hcd structure but only set it to 1
in the XHCI driver.
dummy_hcd can also do streams so set can_do_streams = 1 for that
as well.
This fixes the regression that uas host driver stopped accepting
to bind to the tcm_usb_gadget under dummy_hcd after the offending
commit above.
---
drivers/usb/gadget/dummy_hcd.c | 2 ++
1 file changed, 2 insertions(+)
Meta-comment about the format of the patches. The Subject: line should
have something like:
Subject: USB: gadget:
or the like, to give people a sense of what part of the kernel you are
modifying. Also, there is no Signed-off-by: line, so please go read the
file, Documentation/SubmittingPatches for how to properly format a patch
and what signed-off-by means before resending the next round of patches.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Pantelis Koukousoulas
2014-05-31 08:31:13 UTC
Permalink
Thanks a lot, I will do as you suggested :)
Post by Greg KH
Post by p***@public.gmane.org
Recent commit (14aec589327a6fc4035f5327d90ac5548f501c4c) added the
"can_do_streams" field to the hcd structure but only set it to 1
in the XHCI driver.
dummy_hcd can also do streams so set can_do_streams = 1 for that
as well.
This fixes the regression that uas host driver stopped accepting
to bind to the tcm_usb_gadget under dummy_hcd after the offending
commit above.
---
drivers/usb/gadget/dummy_hcd.c | 2 ++
1 file changed, 2 insertions(+)
Meta-comment about the format of the patches. The Subject: line should
or the like, to give people a sense of what part of the kernel you are
modifying. Also, there is no Signed-off-by: line, so please go read the
file, Documentation/SubmittingPatches for how to properly format a patch
and what signed-off-by means before resending the next round of patches.
thanks,
greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
p***@public.gmane.org
2014-05-30 05:52:04 UTC
Permalink
From: Pantelis Koukousoulas <pktoss-***@public.gmane.org>

dummy_hcd uses jiffies and seems to assume that HZ=1000 and no
tickless behavior. This makes for some horrible performance in
ordinary desktop kernels (tickless / HZ=250) as found in distros.

This patch ports dummy_hcd to use hrtimers instead, which allows
for reasonable performance in distro kernels as well.
(Around 25MB/s for g_mass_storage, around 100MB/s for UAS).

Implementation uses the tasklet_hrtimer framework. This means
dummy_timer callback is executed in softirq context (as it was
with wheel timers) which keeps required changes to a minimum.
---
drivers/usb/gadget/dummy_hcd.c | 55 +++++++++++++++++++++++++++---------------
1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/drivers/usb/gadget/dummy_hcd.c b/drivers/usb/gadget/dummy_hcd.c
index 67ebf90..7fb6f76 100644
--- a/drivers/usb/gadget/dummy_hcd.c
+++ b/drivers/usb/gadget/dummy_hcd.c
@@ -166,7 +166,7 @@ enum dummy_rh_state {
struct dummy_hcd {
struct dummy *dum;
enum dummy_rh_state rh_state;
- struct timer_list timer;
+ struct tasklet_hrtimer ttimer;
u32 port_status;
u32 old_status;
unsigned long re_timeout;
@@ -1190,8 +1190,11 @@ static int dummy_urb_enqueue(
urb->error_count = 1; /* mark as a new urb */

/* kick the scheduler, it'll do the rest */
- if (!timer_pending(&dum_hcd->timer))
- mod_timer(&dum_hcd->timer, jiffies + 1);
+ if (!hrtimer_is_queued(&dum_hcd->ttimer.timer)) {
+ tasklet_hrtimer_start(&dum_hcd->ttimer,
+ ms_to_ktime(1),
+ HRTIMER_MODE_REL);
+ }

done:
spin_unlock_irqrestore(&dum_hcd->dum->lock, flags);
@@ -1211,8 +1214,12 @@ static int dummy_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)

rc = usb_hcd_check_unlink_urb(hcd, urb, status);
if (!rc && dum_hcd->rh_state != DUMMY_RH_RUNNING &&
- !list_empty(&dum_hcd->urbp_list))
- mod_timer(&dum_hcd->timer, jiffies);
+ !list_empty(&dum_hcd->urbp_list) &&
+ !hrtimer_is_queued(&dum_hcd->ttimer.timer)) {
+ tasklet_hrtimer_start(&dum_hcd->ttimer,
+ ns_to_ktime(100),
+ HRTIMER_MODE_REL);
+ }

spin_unlock_irqrestore(&dum_hcd->dum->lock, flags);
return rc;
@@ -1646,17 +1653,25 @@ static int handle_control_request(struct dummy_hcd *dum_hcd, struct urb *urb,
return ret_val;
}

+
/* drive both sides of the transfers; looks like irq handlers to
* both drivers except the callbacks aren't in_irq().
*/
-static void dummy_timer(unsigned long _dum_hcd)
+static enum hrtimer_restart dummy_timer(struct hrtimer *timer)
{
- struct dummy_hcd *dum_hcd = (struct dummy_hcd *) _dum_hcd;
+ struct tasklet_hrtimer *ttimer = container_of(timer,
+ struct tasklet_hrtimer,
+ timer);
+ struct dummy_hcd *dum_hcd = container_of(ttimer,
+ struct dummy_hcd,
+ ttimer);
+
struct dummy *dum = dum_hcd->dum;
struct urbp *urbp, *tmp;
unsigned long flags;
int limit, total;
int i;
+ ktime_t next_tick;

/* simplistic model for one frame's bandwidth */
switch (dum->gadget.speed) {
@@ -1675,11 +1690,9 @@ static void dummy_timer(unsigned long _dum_hcd)
break;
default:
dev_err(dummy_dev(dum_hcd), "bogus device speed\n");
- return;
+ goto out;
}

- /* FIXME if HZ != 1000 this will probably misbehave ... */
-
/* look at each urb queued by the host side driver */
spin_lock_irqsave(&dum->lock, flags);

@@ -1687,7 +1700,7 @@ static void dummy_timer(unsigned long _dum_hcd)
dev_err(dummy_dev(dum_hcd),
"timer fired with no URBs pending?\n");
spin_unlock_irqrestore(&dum->lock, flags);
- return;
+ goto out;
}

for (i = 0; i < DUMMY_ENDPOINTS; i++) {
@@ -1859,10 +1872,16 @@ return_urb:
dum_hcd->udev = NULL;
} else if (dum_hcd->rh_state == DUMMY_RH_RUNNING) {
/* want a 1 msec delay here */
- mod_timer(&dum_hcd->timer, jiffies + msecs_to_jiffies(1));
+ next_tick = ktime_add(hrtimer_get_expires(&dum_hcd->ttimer.timer),
+ ms_to_ktime(1));
+ tasklet_hrtimer_start(&dum_hcd->ttimer, next_tick,
+ HRTIMER_MODE_ABS);
}

spin_unlock_irqrestore(&dum->lock, flags);
+
+out:
+ return HRTIMER_NORESTART;
}

/*-------------------------------------------------------------------------*/
@@ -2238,7 +2257,9 @@ static int dummy_bus_resume(struct usb_hcd *hcd)
dum_hcd->rh_state = DUMMY_RH_RUNNING;
set_link_state(dum_hcd);
if (!list_empty(&dum_hcd->urbp_list))
- mod_timer(&dum_hcd->timer, jiffies);
+ tasklet_hrtimer_start(&dum_hcd->ttimer,
+ ms_to_ktime(1),
+ HRTIMER_MODE_REL);
hcd->state = HC_STATE_RUNNING;
}
spin_unlock_irq(&dum_hcd->dum->lock);
@@ -2316,9 +2337,7 @@ static DEVICE_ATTR_RO(urbs);

static int dummy_start_ss(struct dummy_hcd *dum_hcd)
{
- init_timer(&dum_hcd->timer);
- dum_hcd->timer.function = dummy_timer;
- dum_hcd->timer.data = (unsigned long)dum_hcd;
+ tasklet_hrtimer_init(&dum_hcd->ttimer, dummy_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
dum_hcd->rh_state = DUMMY_RH_RUNNING;
dum_hcd->stream_en_ep = 0;
INIT_LIST_HEAD(&dum_hcd->urbp_list);
@@ -2347,9 +2366,7 @@ static int dummy_start(struct usb_hcd *hcd)
return dummy_start_ss(dum_hcd);

spin_lock_init(&dum_hcd->dum->lock);
- init_timer(&dum_hcd->timer);
- dum_hcd->timer.function = dummy_timer;
- dum_hcd->timer.data = (unsigned long)dum_hcd;
+ tasklet_hrtimer_init(&dum_hcd->ttimer, dummy_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
dum_hcd->rh_state = DUMMY_RH_RUNNING;

INIT_LIST_HEAD(&dum_hcd->urbp_list);
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alan Stern
2014-05-30 15:11:28 UTC
Permalink
Post by p***@public.gmane.org
dummy_hcd uses jiffies and seems to assume that HZ=1000 and no
tickless behavior. This makes for some horrible performance in
ordinary desktop kernels (tickless / HZ=250) as found in distros.
This patch ports dummy_hcd to use hrtimers instead, which allows
for reasonable performance in distro kernels as well.
(Around 25MB/s for g_mass_storage, around 100MB/s for UAS).
Implementation uses the tasklet_hrtimer framework. This means
dummy_timer callback is executed in softirq context (as it was
with wheel timers) which keeps required changes to a minimum.
Good work.
Post by p***@public.gmane.org
-static void dummy_timer(unsigned long _dum_hcd)
+static enum hrtimer_restart dummy_timer(struct hrtimer *timer)
{
- struct dummy_hcd *dum_hcd = (struct dummy_hcd *) _dum_hcd;
+ struct tasklet_hrtimer *ttimer = container_of(timer,
+ struct tasklet_hrtimer,
+ timer);
+ struct dummy_hcd *dum_hcd = container_of(ttimer,
+ struct dummy_hcd,
+ ttimer);
You don't need two statements like this. Just do:

+ struct dummy_hcd *dum_hcd = container_of(timer,
struct dummy_hcd, ttimer.timer);

By the way, the style used in this file is that continuation lines are
indented by two tab stops beyond the original line. (Sometimes four
spaces, if two tab stops would be too much.) They don't line up with
opening parens or anything like this.
Post by p***@public.gmane.org
@@ -2316,9 +2337,7 @@ static DEVICE_ATTR_RO(urbs);
static int dummy_start_ss(struct dummy_hcd *dum_hcd)
{
- init_timer(&dum_hcd->timer);
- dum_hcd->timer.function = dummy_timer;
- dum_hcd->timer.data = (unsigned long)dum_hcd;
+ tasklet_hrtimer_init(&dum_hcd->ttimer, dummy_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
dum_hcd->rh_state = DUMMY_RH_RUNNING;
dum_hcd->stream_en_ep = 0;
INIT_LIST_HEAD(&dum_hcd->urbp_list);
@@ -2347,9 +2366,7 @@ static int dummy_start(struct usb_hcd *hcd)
return dummy_start_ss(dum_hcd);
spin_lock_init(&dum_hcd->dum->lock);
- init_timer(&dum_hcd->timer);
- dum_hcd->timer.function = dummy_timer;
- dum_hcd->timer.data = (unsigned long)dum_hcd;
+ tasklet_hrtimer_init(&dum_hcd->ttimer, dummy_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
dum_hcd->rh_state = DUMMY_RH_RUNNING;
This looks weird, doesn't it? As far as I can see, the only things
that are different between dummy_start() and dummy_start_ss() are the
spin_lock_init, stream_en_ep, and device_create_file. I'd like to see
this code refactored. Maybe in a third patch?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Pantelis Koukousoulas
2014-05-31 08:25:47 UTC
Permalink
Hi,

Thanks a lot for the comments!
I will make the suggested changes and resend properly this time.
Post by Alan Stern
Post by p***@public.gmane.org
static int dummy_start_ss(struct dummy_hcd *dum_hcd)
{
- init_timer(&dum_hcd->timer);
- dum_hcd->timer.function = dummy_timer;
- dum_hcd->timer.data = (unsigned long)dum_hcd;
+ tasklet_hrtimer_init(&dum_hcd->ttimer, dummy_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
dum_hcd->rh_state = DUMMY_RH_RUNNING;
dum_hcd->stream_en_ep = 0;
INIT_LIST_HEAD(&dum_hcd->urbp_list);
@@ -2347,9 +2366,7 @@ static int dummy_start(struct usb_hcd *hcd)
return dummy_start_ss(dum_hcd);
spin_lock_init(&dum_hcd->dum->lock);
- init_timer(&dum_hcd->timer);
- dum_hcd->timer.function = dummy_timer;
- dum_hcd->timer.data = (unsigned long)dum_hcd;
+ tasklet_hrtimer_init(&dum_hcd->ttimer, dummy_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
dum_hcd->rh_state = DUMMY_RH_RUNNING;
This looks weird, doesn't it? As far as I can see, the only things
that are different between dummy_start() and dummy_start_ss() are the
spin_lock_init, stream_en_ep, and device_create_file. I'd like to see
this code refactored. Maybe in a third patch?
I agree it looks a little weird and possibly buggy (because in the superspeed
case the list and timer could get reinitialized after they have started to be
used).

I 'm happy to send a separate patch for this.

Regards,
Pantelis
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alan Stern
2014-05-30 14:53:00 UTC
Permalink
Post by p***@public.gmane.org
I couldn't use dummy_hcd with g_tcm_gadget (UAS storage) with recent 3.15-rc
kernels, firstly it wouldn't even work because stream support was broken
and when I could get it to work with g_mass_storage, performance was horrible
due to the driver using jiffies instead of hrtimers.
With these two patches dummy_hcd works for me acceptably (no doubt it can
be improved even more). Without them I get 4MB/s with usb-storage
(because uas.ko wouldn't bind due to broken streams regression).
With them I get 100MB/s with uas / tcm_usb_gadget (and 25MB/s
for g_mass_storage) in an ordinary 3.15-rc8 HZ=250 tickless kernel.
Please review and consider merging, other people have complained for
http://stackoverflow.com/questions/18606393/very-low-performance-of-g-mass-storage-virtual-usb-device
http://comments.gmane.org/gmane.linux.usb.general/86580
If someone is concerned about performance, they should not be using
dummy-hcd. It will never perform as well as the loop driver. And
that's okay, because it was never meant to be high-performance -- it
was meant to help test gadget drivers.

For testing purposes, faithful emulation is much more important than
performance. To get a good emulation, we need a timer resolution of
1000 Hz, so moving to hrtimers is a good idea. But not for the reason
mentioned in the patch description.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Pantelis Koukousoulas
2014-05-31 09:37:29 UTC
Permalink
Post by Alan Stern
If someone is concerned about performance, they should not be using
dummy-hcd. It will never perform as well as the loop driver. And
that's okay, because it was never meant to be high-performance -- it
was meant to help test gadget drivers.
For testing purposes, faithful emulation is much more important than
performance. To get a good emulation, we need a timer resolution of
1000 Hz, so moving to hrtimers is a good idea. But not for the reason
mentioned in the patch description.
Indeed, I meant performance compared to a real USB device being connected
to a real controller, not as a substitute to the loop driver of course
:) It is true
that my motivation behind this work is to make dummy_hcd more useful to
host/gadget driver authors, not to replace other facilities :)

I will update the description of the series to be more clear.

Thanks a lot for the comments :)
Pantelis
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Loading...